NAVAL RESEARCH LOGISTICS QUARTERLY, DECEMBER 1970, VOL. 17, NO. 4. OFFICE OF NAVAL RESEARCH, NAVSO P-1278.
A GENERALIZED UPPER BOUNDING METHOD FOR DOUBLY COUPLED LINEAR PROGRAMS

James K. Hartman, Naval Postgraduate School, Monterey, California

and

Leon S. Lasdon, Case Western Reserve University, Cleveland, Ohio

1. INTRODUCTION

The constraints of large linear programs can often be partitioned into independent subsets, except for relatively few coupling rows and coupling columns. The individual subsets may, for example, arise from constraints on the activity levels of subdivisions of a large corporation. Alternatively, such blocks may arise from activities in different time periods. The coupling rows may arise from limitations on shared resources or from combining the outputs of subdivisions to meet overall demands.
The coupling columns arise from activities which involve different time periods (e.g., storage), or which involve different subdivisions (e.g., transportation or assembly). The case with only coupling rows or only coupling columns, but not both, has received much attention [8], [9]. A smaller amount of work has been done on the problem with both coupling rows and columns. Ritter has proposed a dual method [2], [7]. Except for the preliminary work of Webber and White [10] and Heesterman [4], there is no primal algorithm which exploits the structure of this problem. There is a need for such an algorithm, since such problems occur often in practice. A primal method is desirable since in large problems slow convergence may force termination of the algorithm prior to optimality.

The algorithm proposed here is an extension of the generalized upper bounding method for problems without coupling columns proposed in [5] and [6]. It produces the same sequence of extreme point solutions as the primal simplex method, and hence has the desirable convergence properties of that algorithm. However, the operations within each simplex iteration are organized to take maximal advantage of problem structure. Because of this structure it is possible to perform the computations while maintaining a compact representation of the basis inverse. In particular, it is sufficient to maintain and update at each cycle a working basis inverse, inverses from each block, and possibly another matrix, V in (4). The dimension of the working basis need never be more than the number of coupling rows in the problem plus the number of coupling columns in the current basis. Hence its dimension may change from cycle to cycle. At most one of the block inverses need be updated at any iteration. Given these quantities, all information needed to carry out a simplex iteration can easily be obtained. Further, efficient relations are given for updating the working basis and block inverses and V for the next cycle.
A significant amount of computational work has been done on a special class of production and inventory problems. Problems as large as 362 rows by 3225 columns were solved. Computation times are encouraging, although no comparisons with other methods are available. The results indicate that the algorithm is sensitive to the dimension of the working basis and that effective measures can be taken to minimize this dimension. When there are only coupling rows, the algorithm simplifies considerably and reduces to the procedure described in [5]. When each diagonal block contains only a single row, it reduces further to the generalized upper bounding method of Dantzig and Van Slyke [1].

2. BASIS STRUCTURE

The problem considered here is

    minimize   x_01
    subject to B_0 x_0 + Σ_{i=1}^{p} A_i x_i = b_0
               D_i x_0 + B_i x_i = b_i,   i = 1, ..., p,

where x_i is a vector with n_i components, all of which must be nonnegative except for x_01, the first component of x_0. The column corresponding to x_01 is all zero except for a 1 in the first row. This first row of the constraint matrix then defines the objective function. In matrix form the constraints can be represented:

               x_0   x_1   x_2   ...   x_p           No. of rows
           [  B_0   A_1   A_2   ...   A_p  ] = b_0       m
           [  D_1   B_1                    ] = b_1       m_1
    (1)    [  D_2         B_2              ] = b_2       m_2
           [   .                ...        ]    .         .
           [  D_p                     B_p  ] = b_p       m_p

    No. of columns: n, n_1, n_2, ..., n_p;  total: M rows, N columns.

The problem has p diagonal blocks of dimension m_i x n_i and an L-shaped border consisting of m coupling constraints and n coupling variables. The set S_i denotes either the variables in the vector x_i or the columns of (1) associated with these variables. The total constraint matrix has dimension M x N. We assume throughout that this matrix has rank M, so that any basis matrix for the system will contain M columns and will be nonsingular.

Consider the structure of any basis, B, for the system of constraints (1).
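To make the layout of (1) concrete, a small instance can be assembled as follows. This is only a sketch: all dimensions and entries are hypothetical, and the special unit column for x_01 is omitted for brevity.

```python
import numpy as np

# Assemble the doubly coupled constraint matrix of (1) for a toy
# instance: p = 2 diagonal blocks, bordered by m0 coupling rows and
# n0 coupling columns.  Sizes and entries are illustrative only.
rng = np.random.default_rng(0)
p, m0, n0 = 2, 2, 2
m, n = [3, 3], [4, 4]          # block row counts m_i and column counts n_i

B0 = rng.integers(-2, 3, (m0, n0)).astype(float)
A  = [rng.integers(-2, 3, (m0, ni)).astype(float) for ni in n]
D  = [rng.integers(-2, 3, (mi, n0)).astype(float) for mi in m]
Bd = [rng.integers(-2, 3, (mi, ni)).astype(float) for mi, ni in zip(m, n)]

M, N = m0 + sum(m), n0 + sum(n)
mat = np.zeros((M, N))
mat[:m0, :n0] = B0                         # corner of the L-shaped border
col = n0
for i in range(p):                         # coupling rows A_i
    mat[:m0, col:col + n[i]] = A[i]
    col += n[i]
row, col = m0, n0
for i in range(p):                         # coupling columns D_i, blocks B_i
    mat[row:row + m[i], :n0] = D[i]
    mat[row:row + m[i], col:col + n[i]] = Bd[i]
    row += m[i]; col += n[i]

print(mat.shape)   # (8, 10), i.e. M x N
```

Outside the L-shaped border and the diagonal blocks every entry is zero, which is the sparsity the compact basis representation of the later sections exploits.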
Arranging the columns of B in the same order as in (1) yields the basis matrix in Figure 1, where the shaded areas contain nonzero entries. The column for x_01 is always in the basis and will always be the first basic column.

[Figure 1: a basis matrix with the coupling rows across the top, the coupling columns at the left, and basic columns from the sets S_1, S_2, S_4, S_5, S_6, and S_7 in rectangular blocks roughly along the diagonal.]

In this example the problem has seven blocks. There are no columns from block 3 in this particular basis. The basis matrix consists of q rectangular blocks, q <= p, roughly along the diagonal with borders from the coupling rows and coupling columns. The rectangular blocks may be either "tall," "square," or "wide." The algorithm to be developed depends on having the lower right hand partition of B consist of square nonsingular blocks along the diagonal. Our strategy for handling the nonsquare blocks will be to rearrange the rows and columns of the basis matrix as follows:

a) For blocks with a column excess, move the extra columns over beside the coupling columns, leaving a square block.

b) For blocks with a row excess, move the extra rows up with the coupling constraint rows, leaving a square block.

c) Assure that the resulting square diagonal blocks are nonsingular.

Performing these operations on the basis B in Figure 1 will give the matrix in Figure 2, where we have labeled the resulting nine submatrices for reference in Theorem 1.

[Figure 2: the rearranged basis matrix. The rows are partitioned into the m coupling rows, the excess rows (from blocks 3, 4, and 5), and the S rows of the block diagonal section; the columns into the l + 1 coupling columns, the excess columns, and the S block diagonal columns.]

The submatrix γ_3 contains square nonsingular blocks and hence it is nonsingular. For this example block 5 was still singular after it was made square, so extra row-column pairs were removed from it until the remaining block became nonsingular. This introduced more excess rows and columns and created nonzero elements in the submatrix γ_2.
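Steps a) and b) above amount to a symmetric reordering of the basis matrix. A minimal sketch of that reordering (the function name, block-membership encoding, and the tiny 6 x 6 example are all hypothetical; step c), the nonsingularity repair, is not attempted here) is:

```python
import numpy as np

def square_blocks(B, row_blk, col_blk, p):
    """Permute B so each of the p blocks keeps an equal number of rows
    and columns; the excess rows/columns move to the border (block 0)."""
    rows, cols = range(B.shape[0]), range(B.shape[1])
    key_r, key_c, exc_r, exc_c = [], [], [], []
    for j in range(1, p + 1):
        rj = [i for i in rows if row_blk[i] == j]
        cj = [i for i in cols if col_blk[i] == j]
        s = min(len(rj), len(cj))          # square part of block j
        key_r += rj[:s]; exc_r += rj[s:]   # extra rows join the border (step b)
        key_c += cj[:s]; exc_c += cj[s:]   # extra columns join the border (step a)
    perm_r = [i for i in rows if row_blk[i] == 0] + exc_r + key_r
    perm_c = [i for i in cols if col_blk[i] == 0] + exc_c + key_c
    return B[np.ix_(perm_r, perm_c)], len(key_r)

# A 6x6 basis: block 1 is "tall" (3 rows, 1 column), block 2 is "wide"
# (2 rows, 3 columns); row 0 and columns 0-1 belong to the border.
B = np.arange(36.0).reshape(6, 6)
P, S = square_blocks(B, [0, 1, 1, 1, 2, 2], [0, 0, 1, 2, 2, 2], p=2)
print(P.shape, S)   # (6, 6) 3
```

Here S = 3: block 1 contributes a 1 x 1 square part (two excess rows), block 2 a 2 x 2 part (one excess column), matching the partitioning of Figure 2.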
For computational purposes the block diagonal section γ_3 should be as large as possible, but still nonsingular. A lower bound on its size is now derived.

THEOREM 1: Let S be the dimension of the block diagonal section γ_3 of Figure 2, and M the dimension of the entire basis matrix. Suppose that the partitioning has been done so all diagonal blocks are nonsingular, and that there is no alternate partitioning which would give larger nonsingular blocks. Then S >= M - l - m.

PROOF: Let any basis matrix for (1) be partitioned as illustrated in Figure 2:

        [ α_1  β_1  γ_1 ]
    B = [ α_2  β_2  γ_2 ]
        [ α_3  β_3  γ_3 ]

The block diagonal submatrix γ_3 is nonsingular, and hence has rank S. Consider the submatrix [β_3, γ_3]. It also has rank S, since it has S linearly independent columns (those of γ_3) and only S rows. Suppose there is a row [β, γ] of the submatrix [β_2, γ_2] which is not a linear combination of the rows of [β_3, γ_3], and, say, this row is an excess row from block j. By the special structure of this row (γ is zero except in block j) there must be at least one excess column from the same block j (otherwise the row [β, γ] could not be independent). Since the row [β, γ] is independent of the rows in the jth block of [β_3, γ_3], this row and one of the excess columns can be adjoined to the jth block to give a larger square nonsingular block than before. But this contradicts the hypothesis of the theorem. Hence every row of [β_2, γ_2] is a linear combination of rows of [β_3, γ_3]. This proves that the submatrix

    K = [ β_2  γ_2 ]
        [ β_3  γ_3 ]

has rank S. Adjoin the first l + 1 columns to K, giving the submatrix

    L = [ α_2  β_2  γ_2 ]
        [ α_3  β_3  γ_3 ]

L has rank <= S + l, since only l + 1 columns were added and the first of these is all zero in these rows. Finally adjoin the first m rows to L, giving the entire basis matrix

        [ α_1  β_1  γ_1 ]
    B = [ α_2  β_2  γ_2 ]
        [ α_3  β_3  γ_3 ]

Then B has rank <= S + l + m, since only m rows have been added to L and hence there can be at most m more independent rows in B than in L.
But rank B = M, so that M <= S + l + m, or S >= M - l - m, completing the proof of Theorem 1.

3. THE ALGORITHM

Our approach to the solution of doubly coupled problems will be to apply the revised primal simplex method to the problem. The special structure of the basis matrix will be exploited to simplify the computational and storage problems which occur for large problems. The revised simplex method involves the following steps at each iteration:

a) Find the simplex multipliers π = c_B B^{-1}.

b) Price out the nonbasic columns P_j, c̄_j = c_j - πP_j, and choose a column P_s with negative c̄_s to enter the basis if the current solution is not optimal.

c) Transform the entering column in terms of the current basis, P̄_s = B^{-1} P_s.

d) Determine the column to leave the basis,

    min over {i : ā_is > 0} of x_Bi / ā_is = x_Br / ā_rs,

where ā_is is component i of P̄_s.

e) Pivot to update B^{-1} and the current solution to account for the basis change.

Steps a) and c) involve multiplication by the basis inverse matrix B^{-1}. To maintain B^{-1} for large problems requires extensive storage and computation, since even though B has special structure, B^{-1} is essentially dense with nonzero elements. Instead of performing the computations in a) and c) directly, we will solve the equations

    (2)    πB = c_B        (solve for π)

    (3)    B P̄_s = P_s     (solve for P̄_s)

by making a transformation of the basis matrix B -> R, where R is block triangular. The resulting equations in terms of R will make it possible to exploit the special structure of B. Consider a nonsingular matrix T defined so that BT = R is upper block triangular:

    (4)    B = [ G  H   ],    T = [ I  0 ],    R = BT = [ B̄  H   ]
               [ J  B_1 ]         [ V  I ]              [ 0  B_1 ],

where B_1 is the block diagonal section γ_3; G, H, and J are the remaining partitions of B; and

    (5)    V = -B_1^{-1} J.

The matrix B̄ is central to the procedure and will be called the working basis. From (4) and (5) we see that

    (6)    B̄ = G + HV = G - H B_1^{-1} J.
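The factorization (4)-(6) is easy to verify numerically on a random partitioned basis. The sketch below (all sizes hypothetical; a dense B_1 is used for simplicity, where the paper's B_1 is block diagonal) checks that BT is block upper triangular and that its upper left block is the working basis:

```python
import numpy as np

# Numerical check of (4)-(6) on a random basis B = [[G, H], [J, B1]].
rng = np.random.default_rng(7)
k, S = 3, 5                                # k = M - S rows in the working basis
G, H = rng.standard_normal((k, k)), rng.standard_normal((k, S))
J, B1 = rng.standard_normal((S, k)), rng.standard_normal((S, S))
B = np.block([[G, H], [J, B1]])

V = -np.linalg.solve(B1, J)                # (5): V = -B1^{-1} J
T = np.block([[np.eye(k), np.zeros((k, S))], [V, np.eye(S)]])
R = B @ T                                  # (4): should be block upper triangular
Bbar = G + H @ V                           # (6): the working basis

assert np.allclose(R[k:, :k], 0.0)         # lower left block of R vanishes
assert np.allclose(R[:k, :k], Bbar)        # upper left block of R is Bbar
assert abs(np.linalg.det(Bbar)) > 1e-12    # Theorem 2: Bbar nonsingular when B is
```

Only B̄^{-1}, the block inverses, and V need be stored, so the dense M x M inverse of B never has to be formed.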
By Theorem 1, B can always be partitioned so that the dimension of B̄ is at most m + l.

THEOREM 2: The working basis B̄ is nonsingular.

PROOF: R is nonsingular since both B and T are. Since all elements of R below B̄ are zero, B̄ must be nonsingular.

To find the simplex multipliers, consider Eq. (2). Multiply through by T, giving

    (7)    πR = πBT = c_B T = (1, 0, ..., 0) T = (1, 0, ..., 0).

The last equality holds since only x_01 appears in the objective function, and by convention x_01 is always the first basic variable. Since T is nonsingular, this multiplication will not change the solution π. Now partition π as π = (π_0, π_1, ..., π_q), where π_0 has M - S components and is the multiplier for rows in the working basis B̄, and π_j has s_j components and is the multiplier for rows in the jth diagonal block B_j1, which is s_j x s_j and nonsingular, j = 1, ..., q. From the structure of R in (4), Eq. (7) becomes

    (8)    π_0 B̄ = (1, 0, ..., 0)    (M - S components),

and

    (9)    π_0 H_j + π_j B_j1 = 0    (s_j components),   j = 1, ..., q,

where H_j is the submatrix of H containing the s_j columns in the jth block. Solving (8) gives

    (10)   π_0 = (1, 0, ..., 0) B̄^{-1} = first row of B̄^{-1},

which can be substituted into (9) to give

    (11)   π_j = -π_0 H_j B_j1^{-1},   j = 1, ..., q.

Hence to obtain the simplex multipliers it suffices to know the inverse of the working basis and the inverse of each of the diagonal blocks.

Next consider Eq. (3) for transforming the entering column in terms of the current basis. Define a nonsingular change of variables by

    (12)   z = T^{-1} P̄_s.

Then P̄_s = Tz, so Eq. (3) becomes

    (13)   Rz = BTz = B P̄_s = P_s,

which is a block triangular system and hence easier to solve than (3). Partition z and P_s as (z_0, z_1, ..., z_q) and (P_s0, P_s1, ..., P_sq), respectively. Then (13) can be written as

    (14)   B̄ z_0 + Σ_{j=1}^{q} H_j z_j = P_s0,

    (15)   B_j1 z_j = P_sj,   j = 1, ..., q.

Solving (15) gives

    (16)   z_j = B_j1^{-1} P_sj,

which when substituted into (14) gives

    (17)   z_0 = B̄^{-1} ( P_s0 - Σ_{j=1}^{q} H_j z_j ).
If P_s is not a coupling column, these formulas simplify, since then P_sj = 0 for all but one j (j = 1, ..., q). Hence by (16) z_j = 0 for all but one j, and the summation in (17) has only one term.

It is now a simple matter to obtain P̄_s from z using P̄_s = Tz. With P̄_s partitioned as (P̄_s0, P̄_s1, ..., P̄_sq), the structure of T in (4) gives

    (18)   P̄_s0 = I z_0 = z_0,

and

    (19)   P̄_sj = V_j z_0 + I z_j = V_j z_0 + z_j,   j = 1, ..., q,

where V_j is the submatrix of V containing the rows in the jth block. This completes the transformation of the column entering the basis.

4. CHANGING THE BASIS

Pricing out columns to select the entering column and choosing the column to leave the basis are done as in the ordinary revised simplex method, so it only remains to describe the updating procedures. Since the entire basis inverse B^{-1} is not needed, the updating requirements will be somewhat different from those of the ordinary revised simplex algorithm. In particular, reviewing the solution of Eqs. (2) and (3) shows that it is sufficient to update B̄^{-1}, B_j1^{-1} (j = 1, ..., q), and V at each iteration. Before describing the various cases which can occur, we derive a general result for updating the working basis inverse.

THEOREM 3: If a basis change can be described by *B^{-1} = E B^{-1}, where *B^{-1} is the basis inverse after the change, and E is any transformation matrix, then the working basis inverse can be updated by

    (20)   *B̄^{-1} = (E_1 + E_2 V) B̄^{-1},

where E_1 and E_2 are partitions of E to be described in the proof.

PROOF: From the definition and nonsingularity of R and T in (4), (5),

    (21)   *R^{-1} = *T^{-1} *B^{-1} = *T^{-1} E B^{-1} = *T^{-1} E T R^{-1}.

Here all *'ed symbols relate to the system after the basis change is made. Writing (21) in partitioned form and using partitioned inverse theorems gives

    (22)   [ *B̄^{-1}  -*B̄^{-1} *H *B_1^{-1} ]   [  I   0 ] [ E_1  E_2 ] [ I  0 ] [ B̄^{-1}  -B̄^{-1} H B_1^{-1} ]
           [ 0         *B_1^{-1}            ] = [ -*V  I ] [ E_3  E_4 ] [ V  I ] [ 0        B_1^{-1}           ].

To update the working basis inverse, note that it appears in the upper left hand partition of *R^{-1}.
Hence deleting all but that submatrix gives the following specialization of (22):

    (23)   *B̄^{-1} = [ E_1  E_2 ] [ B̄^{-1}   ]  =  (E_1 + E_2 V) B̄^{-1}.
                                  [ V B̄^{-1} ]

Thus E_1 + E_2 V is a transformation matrix for the working basis inverse.

Let the columns from the block diagonal section of B be called "key" columns. The remaining columns of B are called "non-key." There are several cases to consider in the updating procedure:

CASE 1: The column leaving the basis is non-key. Then the entering column can be brought into the basis as a non-key column, and the standard updating formula for B^{-1} is *B^{-1} = E B^{-1}. Here E is an elementary matrix, equal to the identity except in the rth column, which is given by (η_1, η_2, ..., η_M)', where

    (24)   η_i = -ā_is / ā_rs,   i = 1, ..., M,  i ≠ r;      η_r = 1 / ā_rs

(recall that the rth basic column is leaving the basis). Since the leaving column is non-key, partitioning E as in Theorem 3 gives E_2 = 0, and E_1 is an (M - S) x (M - S) elementary matrix equal to the identity except in the rth column, which is (η_1, ..., η_{M-S})'. To update the working basis inverse for Case 1, substitute these into (20):

    (25)   *B̄^{-1} = (E_1 + E_2 V) B̄^{-1} = (E_1 + 0 V) B̄^{-1} = E_1 B̄^{-1}.

Hence B̄^{-1} is updated by a single pivot operation. None of the key columns in B change in Case 1, and hence none of the diagonal blocks B_j1 in B_1 will change. Thus the block inverses B_j1^{-1} require no updating. To update V, recall that *V = -*B_1^{-1} *J. Now *B_1^{-1} = B_1^{-1} as shown above, and *J is different from J only in the rth column, which is replaced by the last S components of the entering column, (P_s1, ..., P_sq)'. Thus only the rth column of V will change, and this will be replaced by

    (26)   -B_1^{-1} (P_s1, ..., P_sq)' = -(z_1, ..., z_q)',

which has already been calculated in transforming the entering column. This completes the updating calculations for Case 1.

CASE 2: The column leaving the basis is a key column (from the block diagonal section).

CASE 2a: Both the leaving column and the entering column are from the same block B_j1.
Then if the entering column will leave B_j1 nonsingular, a direct pivot can be performed and the basis change can be described by *B^{-1} = E B^{-1}, where E is an elementary column matrix. The E matrix differs from that in Case 1 only because the eta column (η_1, ..., η_M)' is now in the key section of the matrix. Thus E_1 = I_{(M-S) x (M-S)}, and E_2 has only one nonzero column, (η_1, ..., η_{M-S})', which is the (r - (M - S))th. To update the working basis inverse, apply formula (20):

    (27)   *B̄^{-1} = (E_1 + E_2 V) B̄^{-1}
                   = ( I + (η_1, ..., η_{M-S})' v ) B̄^{-1}
                   = B̄^{-1} + (η_1, ..., η_{M-S})' (v B̄^{-1}),

where v is the (r - (M - S))th row of V. In the block diagonal submatrix B_1, only the (r - (M - S))th column will change. Hence only one diagonal block inverse, say B_j1^{-1}, will change. B_j1^{-1} will be updated by a single pivot, or equivalently,

    (28)   *B_j1^{-1} = E B_j1^{-1},

where E is an elementary column matrix formed from the elements of the new column. This change will leave B_j1 nonsingular if and only if the pivot element in this new column is nonzero. This condition must be checked to see if Case 2a can be applied. The J submatrix will not change, and only the jth block of B_1^{-1} changes, so in V = -B_1^{-1} J only the jth partition will change. The updated version is given by

    (29)   *V_j = -(*B_j1^{-1}) J_j = -E B_j1^{-1} J_j = E V_j,

so the same elementary matrix is used to update V.

CASE 2b: The column leaving the basis is a key column from block B_j1, and Case 2a cannot be used, either because the entering column is not from the same block (it may be a coupling column) or because the nonsingularity test in Case 2a failed. Then we can avoid making B_j1 smaller only if there is an excess column (see Figure 2) from block j which can be exchanged with the leaving column. If there is such a column, say the kth basic column, then after the exchange *B = BE, where E is a permutation matrix, an identity matrix with the rth and kth columns exchanged. Since E^{-1} = E, *B^{-1} = E B^{-1}.
Then (20) yields *B̄^{-1} = (E_1 + E_2 V) B̄^{-1}, where E_1 is an (M - S) x (M - S) identity except for a zero in the kth diagonal position, and E_2 is zero except for a one in the kth row and (r - (M - S))th column. Hence

    (30)   *B̄^{-1} = (E_1 + E_2 V) B̄^{-1},

where E_1 + E_2 V equals the identity except that its kth row is v, the (r - (M - S))th row of V; that is, *B̄^{-1} is B̄^{-1} with its kth row replaced by v B̄^{-1}. This is a valid interchange if and only if E_1 + E_2 V is nonsingular, which is true if and only if the kth element of the row v is nonzero. If all elements of v in the excess columns are zero, then no interchange is possible and we proceed to Case 2c.

If an exchange occurs, one column of the block B_j1 will change. Hence the inverse can be updated by a simple pivot,

    (31)   *B_j1^{-1} = E B_j1^{-1},

where E is an elementary matrix. The elements needed to form the eta column in E are found in the kth column of V_j. To update V, note that by definition *V = -*B_1^{-1} *J. Only the jth block of B_1^{-1} changes, and a single column of J changes, but this column is zero outside the jth block J_j. Hence only the submatrix V_j needs to be updated:

    (32)   *V_j = -*B_j1^{-1} *J_j = -E B_j1^{-1} *J_j.

Now *J_j = J_j except in the kth column, which is replaced by the leaving column, which we will suppose to be the lth column in B_j1. Thus B_j1^{-1} *J_j = -V_j except in the kth column, which becomes the lth unit vector, so that

    (33)   *V_j = E V_j except in the kth column, which is the negative of the eta column of E.

This completes the updating required for an exchange of key and non-key basic columns. After the exchange the leaving column is non-key, so Case 1 can be used to bring the new column into the basis. It should be noted that Case 1 uses the elements ā_is of the transformed entering column (see (24)) and the vector (z_1, ..., z_q) (see (26)). The interchange will affect these quantities. To update them, interchange ā_rs and ā_ks, and replace z_j by E z_j.

CASE 2c: The column leaving the basis is a key column from block B_j1, and neither Case 2a nor 2b apply.
When the column leaves, the new *B_j1 will have one less column, and hence to remain square and nonsingular it must also lose a row. The process is most easily described by a repartitioning step followed by an application of Case 1. In the repartitioning step, the leaving column is shifted to the excess column partition, and some row of B_j1 is made an excess row. Without loss of generality, assume that the pair being shifted is the first column and row of B_11.

The basis matrix before and after the change is given below. All elements of the matrices are identical; only the partitioning changes:

    (34)   B = [ G  H   ]        (35)   *B = [ *G  *H   ]
               [ J  B_1 ]                    [ *J  *B_1 ]

In the original partitioning (34), the indicated row and column are added to the non-key section, resulting in the new partitioning (35). In this new partitioning of the basis, *G has become larger by one row and one column, *B_11 has become smaller, and the other partitions change in corresponding ways. Now *B = B, but because of the changes in partition sizes, *T ≠ T and *R ≠ R. Hence we must compute *B_11^{-1}, *B̄^{-1}, and *V.

*B_11^{-1} is computed via standard formulas. If

    (36)   B_11^{-1} = [ W  X ]
                       [ Y  Z ],

where Z contains all but the first row and column, then

    (37)   *B_11^{-1} = Z - Y X / W.

This is possible only if W ≠ 0, which provides a criterion for choosing the row to be shifted.†

To obtain *B̄^{-1}, recall that B = *B, so

    (38)   *R^{-1} = *T^{-1} *B^{-1} = *T^{-1} B^{-1} = *T^{-1} T R^{-1}.

† In general, if the kth column of B_j1 is being shifted, then the lth row can be shifted if the lth element of row k in B_j1^{-1} is nonzero. There must always be at least one nonzero element in row k, since B_j1^{-1} is nonsingular.
In partitioned form this is

    (39)   [ *B̄^{-1}  -*B̄^{-1} *H *B_1^{-1} ]   [  I   0 ] [ I  0 ] [ B̄^{-1}  -B̄^{-1} H B_1^{-1} ]
           [ 0         *B_1^{-1}            ] = [ -*V  I ] [ V  I ] [ 0        B_1^{-1}           ].

To isolate *B̄^{-1}, delete all but the upper left hand corner of this expression, giving

    (40)   *B̄^{-1} = [ B̄^{-1}    h       ]
                     [ v B̄^{-1}  vh + W  ],

where v is the first row of V, h is the first column of -B̄^{-1} H B_1^{-1}, and (W, Y')' is the first column of B_11^{-1}. The vector h is computed as

    (41)   h = first column of -B̄^{-1} H B_1^{-1}
             = -B̄^{-1} H (first column of B_1^{-1})
             = -B̄^{-1} H (W, Y', 0, ..., 0)'
             = -B̄^{-1} H_1 (W, Y')',

where H_1 is the first partition of H. We note that the repartitioning here changes all partitions of B. Hence in the expression B̄ = G - H B_1^{-1} J for the working basis, all four factors change. Nevertheless, we need only add a "border" to B̄^{-1} to get *B̄^{-1}.

Starting from the equation *T^{-1} = *R^{-1} R T^{-1} and arguing as above, a formula for updating V is obtained:

    (42)   *V = [ V_1 - Y v / W    Y / W ]
                [ V̄                 0    ],

where v is the first row of V, V_1 is the rest of the first block of V, and V̄ is all other blocks of V. Note that *V has one less row and one more column than V and that the only nontrivial change occurs in the first block. Performing these operations repartitions the basis matrix so that the leaving column becomes non-key. Applying Case 1 will then complete the basis change for Case 2c.

CASE 3: Using the above cases the simplex method can be performed, but the working basis will increase in dimension by one each time Case 2c occurs. Hence it may be desirable to periodically repartition the basis by moving excess rows and columns into the block diagonal section, essentially the reverse of Case 2c. Suppose we have an excess row and an excess column from block j, and consider the submatrices

    (43)   [ α  β    ]    α: the intersection of the excess row and excess column,
           [ γ  B_j1 ]    β: the excess row in the columns of the jth diagonal block,
                          γ: the excess column in the rows of B_j1.

Bringing the row and column into the key section would mean adding them to B_j1, resulting in a larger jth diagonal block:

    (44)   *B_j1 = [ α  β    ]
                   [ γ  B_j1 ].

We already know B_j1^{-1} and we want to compute the inverse of the new block.
If this inverse exists, suppose that it is partitioned as

    (45)   *B_j1^{-1} = [ W  X ]
                        [ Y  Z ].

Then [3] the blocks are given by

    (46)   W = 1 / (α - β B_j1^{-1} γ)
           X = -W β B_j1^{-1}
           Y = -B_j1^{-1} γ W
           Z = B_j1^{-1} - B_j1^{-1} γ X,

and the inverse exists if and only if

    (47)   α - β B_j1^{-1} γ ≠ 0.

For computational purposes, note that -B_j1^{-1} γ is just the jth partition of the column in V corresponding to the column being moved. Hence to test for possible pairs to be moved requires computation of an inner product of β with a column of V_j.

For convenience in expression, and without loss of generality, we again assume that the last row and column of the non-key section will become the first row and column of the key section. The elements of *B will be exactly the same as the elements of B; only the partitions will change. The new working basis *B̄ will become smaller. To find an expression for its inverse, consider the expression

    (48)   *R^{-1} = *T^{-1} *B^{-1} = *T^{-1} B^{-1} = *T^{-1} T R^{-1}.

Writing this in partitioned form will give the following matrix equation:

    (49)   [ *B̄^{-1}  -*B̄^{-1} *H *B_1^{-1} ]   [  I   0 ] [ I  0 ] [ B̄^{-1}  -B̄^{-1} H B_1^{-1} ]
           [ 0         *B_1^{-1}            ] = [ -*V  I ] [ V  I ] [ 0        B_1^{-1}           ].

Taking only the first rows and columns of this equation gives

    (50)   *B̄^{-1} = B̄^{-1} with its last row and column deleted.

Thus to get the new working basis inverse, merely delete the last row and the last column in the current working basis inverse. To calculate *V, use the equation

    (51)   *T^{-1} = *R^{-1} *B = *R^{-1} B = *R^{-1} R T^{-1}.

Then writing this in partitioned form and proceeding as above yields the result

    (52)   *V_1 = [ -W ψ      ]
                  [ V_1 - Y ψ ],

where V_1 is the first partition of V less the last column, and V̄, the remaining partitions of V less the last column, is otherwise unchanged. For this to be computed we must find ψ, which is the bottom row of the working basis less its last component ξ. The working basis is not carried along explicitly, so we must compute ψ from B̄ = G + HV in (6), where G and H are partitions of the basis matrix B and V is updated at each iteration of the process.
We have

    (53)   (ψ, ξ) = last row of B̄
                  = last row of G + (last row of H) V
                  = g + h V        (g = last row of G, h = last row of H)
                  = g + (β, 0) V
                  = g + β V_1,

and dropping the last element of this gives ψ.

In the above calculations the most complicated formulas were those for updating V in the two repartitioning procedures, Eqs. (42) and (52). An alternative procedure in these cases is to compute V from its definition V = -B_1^{-1} J. In each case only one partition of V changes, say partition V_j, given by

    (54)   *V_j = -*B_j1^{-1} *J_j,

with *B_j1 of dimension s_j. To compute this requires s_j^2 multiplications for each nonzero column of *J_j, a total of at most (M - S) x s_j^2 multiplications. The updating calculations require on the order of (M - S) x s_j multiplications. Hence they are computationally superior, but require more extensive program logic. After performing the appropriate updating procedure, we are ready to start the next simplex iteration. Thus the description of the basic algorithm is complete.

5. SPECIALIZATION TO PROBLEMS WITH ONLY COUPLING ROWS

For problems which have no coupling columns, the algorithm simplifies considerably. The major simplification is that no repartitioning will ever be necessary and there will never be any excess rows. Hence Cases 2c and 3 of the updating procedure, which are the most complicated, are never needed. The working basis B̄ in (6) will always have exactly as many rows as there are coupling constraints, and each of the blocks B_j1 has dimension m_j (see (1)). Relations (10), (11) for the simplex multipliers remain the same, as do relations (16)-(19) for the transformed entering column. In updating, only Cases 1, 2a, and 2b can occur. These remain essentially the same. The computations involving V simplify because each column of V except the first has only one nonzero partition. This specialization of the algorithm is very similar to the method proposed by Kaul [5] and is described by Lasdon in [6].

6.
APPLICATIONS TO PRODUCTION AND INVENTORY PROBLEMS

Consider a corporation which wishes to schedule the production of K products at L plants for T time periods into the future. At each plant, and for each period, the demand for each product is considered known and must be met in that period. The operation of the lth plant in the tth time period is hence limited by these K demand constraints and also by r constraints on locally available resources (e.g., plant capacity, labor), giving rise to a constraint block with K + r rows. There are LT such diagonal blocks. These blocks are coupled by constraints on scarce corporate resources which are allocated across the various plants and budgeted over time (e.g., corporate capital, scarce raw materials, highly skilled labor). In addition to producing for immediate demand, any plant may

a) Produce a product and place it in inventory for future use.

b) Produce a product and ship it to some other plant which has a shortage of that product.

Each of the inventory and transportation activities gives rise to a column which couples two of the diagonal blocks. Thus we have a doubly coupled linear program to solve. The number of rows is on the order of (K + r)LT, so truly large problems may result. There are KLT(2L + T - 3)/2 coupling columns arising from the inventory and transportation activities, but these have very special structure. Each column has only a cost coefficient, a single +1 in the kth demand equation for one block, and a single -1 in the kth demand equation for another block. Hence they may be stored implicitly and priced out with minimal effort. Since these coupling activities incur a cost in addition to the production cost, we anticipate that even though there are many such activities, relatively few will be profitable. Hence in any basis there should be relatively few coupling columns, so that the algorithm described can be applied. A number of computational simplifications appear.
In computing the transformed entering column, at most two of the Z_j in (16) will be nonzero, so (17) simplifies considerably. The special form of the coupling columns implies that in J any coupling column has only two nonzero elements, +1 and −1, while an excess column has only one nonzero partition. Hence in V = −B_1^{-1}J a coupling column has at most two nonzero partitions, and these are columns of the block inverses B_i^{-1}. An excess column has one nonzero partition in V. Hence it is probably best to compute V at each iteration instead of updating it. This is desirable since the most complicated update formulas are those involving V.

7. COMPUTATIONAL RESULTS

The algorithm described above has been coded and used to solve a number of test problems of the type described in section 6. The program was written in FORTRAN V for the Univac 1108 computer at Case Western Reserve University. The special structure of these problems made it possible to solve reasonably large programs entirely in core. All problems were solved in single precision arithmetic. Whenever a block inverse or the working basis inverse had been updated 50 times, it was re-inverted using a standard Gaussian elimination routine. Good numerical accuracy was obtained, in that different runs on the same problem yielded solutions which were the same to seven significant figures. The code was not written to be competitive with commercial routines, but rather to investigate the effects of various pricing and repartitioning strategies. Nevertheless, the solution times recorded are encouraging. Data describing the test problems are given in Table 1. The notation used is as in section 6. For each problem size, two problems were formulated. The lower numbered problem of each pair was constructed to be relatively easy, in that few coupling columns appear in an optimal basis.
The second problem in each pair is derived from the first by adjusting the right hand side demand and resource availability vectors so that more coupling activities are required. Hence it is more difficult. For all problems, phase I was initiated with a basis consisting of slack variables for all resource constraints and artificial variables for demand constraints. Table 2 describes the effect of three different pricing strategies. Pricing strategy one allowed coupling columns to enter the basis at any iteration. It led to long running times and large working bases, since many coupling columns tended to enter the basis in phase I, even in problems for which there were none in the optimal basis. Pricing strategy two did not allow coupling columns to enter the basis in the first M iterations unless all other columns priced out optimally. It produced a substantial reduction in running time. Strategy three, which was the most successful, allowed no coupling columns

Table 1. Test Problem Descriptions

Problem   Products  Plants  Periods  Blocks  Block  Coupling  Coupling  Total problem
numbers   (K)       (L)     (T)      (LT)    size   columns   rows      size
1, 2      5         3       4        12      7x22   210       6         90 x 474
3, 4      5         3       6        18      7x22   405       8         134 x 801
5, 6      5         3       8        24      7x22   660       10        178 x 1188
7, 8      5         3       10       30      7x22   975       12        222 x 1635
9, 10     5         5       6        30      7x22   975       8         218 x 1635
11, 12    5         5       8        40      7x22   1500      10        290 x 2380
13, 14    5         5       10       50      7x22   2125      12        362 x 3225

Table 2.
Effect of Pricing Strategies

Problem  Pricing   Total       Total       Iterations  Maximum size of  Maximum number of
number   strategy  iterations  time (sec)  per sec     working basis    coupling columns in basis
1        1         329         18.52       17.8        31               28
         3         178         5.05        35.2        9                3
3        1         440         36.88       11.9        37               34
         2         326         12.99       25.1        22               15
         3         274         9.39        29.2        10               4
5        1         Exceeded 60 second time limit
         2         451         25.33       17.8        25               20
         3         321         14.43       22.2        15               6
7        1         Not attempted
         2         557         38.07       14.6        27               21
         3         438         24.12       18.2        21               11

(All runs made with Case 3 repartitioning attempted every 10 iterations.)

Table 3. Effects of Repartitioning Strategies

(Columns 1, 2a, 2b, 2c, 3 give the number of occurrences of each case in updating. FWB = final size of working basis; FCC = final number of coupling columns; MWB = maximum size of working basis; MCC = maximum number of coupling columns in basis.)

Prob  Strat  Iter   Time (sec)  Iter/sec   1    2a   2b   2c   3     FWB  FCC  MWB  MCC
1     1      178    4.95        36.0       82   96   25   5    0     11   -    11   3
      2      178    5.05        35.2       81   97   26   7    17    6    -    9    3
      3      178    4.60        38.7       81   97   26   7    35    6    -    9    3
      6      178    4.62        38.5       81   97   25   9    53    6    -    9    3
2     1      299    7.71        38.8       158  144  45   9    0     15   5    15   7
      2      311    7.66        40.6       163  148  50   17   31    10   5    11   8
      3      312    7.66        40.7       162  150  48   19   62    9    5    11   8
      6      311    7.72        40.3       162  149  49   20   64    10   5    12   8
5     1      322    16.17       19.9       113  209  38   9    0     19   -    19   6
      2      321    14.43       22.2       108  213  38   9    31    10   -    15   6
      3      321    13.49       23.8       108  213  38   9    64    10   -    15   6
      6      321    13.46       23.8       108  213  38   9    174   10   -    15   6
6     1      711    45.97       15.5       361  350  84   26   0     36   18   36   23
      2      733    39.81       18.4       379  354  82   63   73    22   16   25   20
      3      706    38.16       18.5       349  357  76   59   141   24   18   26   22
      6      675    36.33       18.6       322  353  69   78   426   23   17   28   23
9     1      421    24.99       16.8       192  229  62   13   0     21   4    21   12
      2      416    21.96       18.9       188  228  55   34   41    12   6    17   13
      3      416    21.66       19.2       188  228  55   34   83    10   6    17   13
      6      416    21.38       19.5       188  228  55   32   132   10   6    18   13
10    1      1023   62.14       16.5       638  385  200  25   0     33   15   33   26
      2      1034   54.13       19.1       656  378  175  77   103   21   15   29   26
      3      1025   54.78       18.7       646  379  177  77   205   21   15   26   23
      6      1023   53.01       19.3       650  373  171  85   705   21   15   27   22
11    1      570    46.35       12.3       277  293  84   15   0     25   5    25   14
      2      565    38.43       14.7       263  302  86   39   56    12   5    21   16
      3      565    38.42       14.7       263  302  86   39   113   12   5    21   16
      6      577    38.78       14.9       285  292  86   40   244   11   5    21   16
12    1      1194   92.68       12.9       748  446  233  28   0     38   17   38   23
      2      1209   84.14       14.4       764  445  217  76   121   23   16   30   26
      3      1208   89.49       13.5       766  442  225  81   241   25   17   33   27
      6      1231   87.02       14.1       784  447  220  98   900   25   18   33   26
13    1      754    96.01       7.9        390  364  108  26   0     38   7    38   19
      2      747    69.41       10.8       372  375  102  55   74    14   9    26   19
      3      751    68.44       11.0       377  374  99   61   150   14   9    26   19
      6      769    73.02       10.5       397  372  104  72   372   17   9    30   22
14    1      1364   137.38      9.9        787  577  258  81   0     43   20   43   26
      2      1390   123.29      11.3       790  600  238  79   139   28   20   33   26
      3      1389   118.84      11.7       807  582  247  82   278   27   20   34   28
      4      1351   119.74      11.3       750  601  237  89   1351  29   21   35   27
      5      1379   126.08      10.9       789  590  248  79   414   30   21   37   27
      6      1351   113.60      11.9       750  601  237  89   1066  29   21   35   27

to enter the basis (unless necessary) until all artificial variables have left the basis. Pricing strategy three was used for all of the remaining runs. All problems in Table 2 were solved with the repartitioning procedure of Case 3 attempted every 10 iterations (see section 4). Table 3 shows the effects of various repartitioning strategies, i.e., strategies for employing Case 3. The strategies tested are to attempt Case 3:

1. never
2. every 10 iterations
3. every five iterations
4. every iteration
5. whenever there are at least 10 excess columns
6. whenever there are at least five excess columns.

The different strategies often gave rise to slightly different numbers of iterations, probably because different strategies result in different orderings of the rows in the problem. Hence if the basis is degenerate, ties may be broken in different ways, and different columns will leave the basis. Using the strategy of never employing Case 3 involves a tradeoff. The computations of Case 3 never have to be performed, and Case 2c is performed less often.
This is because excess columns accumulate in the non-key section, so the interchange of Case 2b is more likely to succeed. However, if Case 3 is not performed, then the dimension of the working basis increases whenever Case 2c is performed, and never decreases. The overall result is that the number of iterations per second is lowest among all strategies tested. Hence it is desirable to repartition the working basis periodically. Among the other strategies tested, no consistent differences emerged.

REFERENCES

[1] Dantzig, G. B. and R. M. Van Slyke, "Generalized Upper Bounded Techniques for Linear Programming," Journal of Computer and System Sciences 1, 213-226 (1967).
[2] Grigoriadis, M. D. and K. Ritter, "A Decomposition Method for Structured Linear and Nonlinear Programs," Journal of Computer and System Sciences 3, 335-360 (1969).
[3] Hadley, G., Linear Algebra (Addison-Wesley, Reading, Mass., 1961).
[4] Heesterman, A. R. G., "Special Simplex Algorithm for Multisector Problems," Numerische Mathematik 12, 288-306 (1968).
[5] Kaul, R. N., "An Extension of Generalized Upper Bounded Techniques for Linear Programs," ORC-65-27, Operations Research Center, University of California, Berkeley (1965).
[6] Lasdon, L. S., Optimization Theory for Large Systems (Macmillan Company, New York, 1970).
[7] Ritter, K., "A Decomposition Method for Linear Programming Problems with Coupling Constraints and Variables," MRC No. 739, Mathematics Research Center, University of Wisconsin (1967).
[8] Rosen, J. B., "Primal Partition Programming for Block Diagonal Matrices," Numerische Mathematik 6, 250-260 (1964).
[9] Sakarovitch, M. and R. Saigal, "An Extension of Generalized Upper Bounding Techniques for Structured Linear Programs," SIAM Journal on Applied Mathematics 15, 906-914 (1967).
[10] Webber, D. W. and W. W. White, "An Algorithm for Solving Large Structured Linear Programming Problems," IBM New York Scientific Center Report No. 320-2946 (1968).
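The most successful pricing rule above (strategy three: admit a coupling column only when no other column prices out, until all artificial variables have left the basis) can be sketched as follows. The function name and data layout are hypothetical, not taken from the authors' FORTRAN code:

```python
def choose_entering_column(reduced_costs, is_coupling, artificials_in_basis):
    """Pricing strategy three (sketch).  Among columns with negative
    reduced cost, prefer non-coupling columns while artificial variables
    remain basic; admit a coupling column only when nothing else prices
    out, or once all artificials have left the basis.
    reduced_costs: dict name -> value; is_coupling: dict name -> bool."""
    candidates = [n for n, rc in reduced_costs.items() if rc < -1e-9]
    if artificials_in_basis:
        ordinary = [n for n in candidates if not is_coupling[n]]
        if ordinary:
            candidates = ordinary
    if not candidates:
        return None  # current basis prices out optimally
    # most negative reduced cost among the admitted candidates
    return min(candidates, key=lambda n: reduced_costs[n])
```

This keeps coupling columns out of the phase I basis, which is what Table 2 credits for the smaller working bases and shorter running times of strategy three.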
NONLINEAR PROGRAMMING: THE CHOICE OF DIRECTION BY GRADIENT PROJECTION*

Philip B. Zwart
Washington University
St. Louis, Missouri

ABSTRACT

Rosen's method of Gradient Projection chooses a search direction which is not necessarily the direction of steepest ascent. However, the projection of the gradient onto a "suitably chosen subspace" does yield the direction of steepest ascent. The suitable choice is easily recognized as a result of some theorems relating gradient projection to steepest ascent. These results lead to a modification of Rosen's method. The modification improves the choice of search direction and usually yields the steepest ascent direction without solving a quadratic programming problem.

I. INTRODUCTION

The general continuous nonlinear programming problem can be stated as: find x* = (x_1*, . . ., x_n*) which yields the maximum value of F(x) among all x which satisfy φ_i(x) ≥ 0, i = 1, . . ., k, i.e.,

(1) Max F(x), subject to: φ_i(x) ≥ 0, i = 1, . . ., k.

F and the φ_i's are assumed to be concave and twice continuously differentiable. Concavity is usually required for theoretical discussions of convergence; numerical approaches to solving (1) can often be successfully applied even if concavity is lacking. This is true for the material discussed herein. Gradient Projection, first discussed by J. B. Rosen [2], [3], is a numerical iterative procedure for solving (1) (or attempting to solve (1)) by a sequence of one-dimensional searches made in the direction of the "projected gradient." This paper indicates a weakness in Rosen's method, gives some mathematical results relating gradient projection to steepest ascent, and describes a small modification which follows naturally from the mathematical results and overcomes the indicated weakness. Section II is devoted to notation used throughout the paper. Section III gives a rough description of Rosen's method and a detailed description of those aspects with which this paper is concerned.
Section IV shows that not all "active" constraints should necessarily be used in projecting the gradient. Section V reviews some linear algebra which may make the mathematical proofs more lucid. Section VI presents two theorems which relate the "suitably projected" gradient direction to the direction of steepest ascent. Section VII presents the suggested modification to Rosen's method. Section VIII relates this material to the methods of feasible directions discussed by Zoutendijk. Concluding remarks are given in Section IX.

*This research was supported in part by the Atomic Energy Commission under Research Contract No. AT(11-1)-1493 and by the Department of Defense under THEMIS Grant No. F44620-69-C-0116.

II. NOTATION

Let F and φ_i, i = 1, . . ., k, be as described in Section I. Then ∇F(x), ∇φ_i(x) denote the respective gradients at the point x = (x_1, . . ., x_n). If I_q = {i_1, . . ., i_q} is a set of indices taken from 1, . . ., k, then P_q(x) denotes the projection operation which maps any vector into its projection onto the subspace perpendicular to the subspace spanned by the ∇φ_i(x), i ∈ I_q. For instance, P_q(x)∇F(x) is the projection of ∇F(x) onto the subspace perpendicular to the subspace spanned by the ∇φ_i(x), i ∈ I_q. A unit vector in the direction of P_q(x)∇F(x) is denoted by d_q(x). Finally, in cases where the argument x is obvious, the "(x)" is omitted. Thus, ∇F, ∇φ_i, P_q, d_q are used in place of ∇F(x), ∇φ_i(x), P_q(x), d_q(x).

III. ROSEN'S GRADIENT PROJECTION METHOD

Rosen's method of choosing a search direction is described below. It is assumed that the feasible point x is given (this point is usually the result of a one-dimensional search starting from some other point). We are only concerned with how a direction is chosen for the next one-dimensional search. Let I_q = {i_1, . . ., i_q} be the set of indices i for which |φ_i(x)| ≤ δ_i. The δ_i's are preassigned small numbers.
It is assumed that the ∇φ_i(x), i ∈ I_q, are linearly independent vectors. Rosen's Gradient Projection method chooses the search direction as follows: We have

P_q(x)∇F(x) = ∇F(x) − Σ_{i∈I_q} r_i ∇φ_i(x).

(Note: The vector r with components r_i, i ∈ I_q, is given by r = (N_q N_q′)^{-1} N_q ∇F(x), where N_q is the matrix whose rows are the vectors ∇φ_i(x), i ∈ I_q.)

1. If ||P_q(x)∇F(x)|| is "very small" (described in detail in Rosen's papers) and r_i ≤ 0, i ∈ I_q, then do not search any more.
2. If ||P_q(x)∇F(x)|| is not "small" (described in detail in Rosen's papers), then use the direction of P_q(x)∇F(x) as the search direction.
3. If ||P_q(x)∇F(x)|| is "small" (described in detail in Rosen's papers) and some r_j > 0, j ∈ I_q, then drop the index j from the set I_q to obtain a new set I_{q−1}. Use the direction of P_{q−1}(x)∇F(x) as the search direction. The index j is determined by

r_j / ||∇φ_j(x)|| = max_{i∈I_q} { r_i / ||∇φ_i(x)|| }.

The details of what is meant by "very small" and "small" are not included here because the discussion in this paper is independent of these details. This paper is concerned with steps 2 and 3. Steps 2 and 3 say that j is dropped from the set of indices I_q only if ||P_q(x)∇F(x)|| is small. Actually, as will be shown below, a better search direction (i.e., one which yields a greater rate of increase of F) can be obtained by dropping j whenever

max_{i∈I_q} { r_i / ||∇φ_i(x)|| }

is positive.

IV. ACTIVE CONSTRAINTS

Fig. 1 illustrates a situation in which the projected gradient direction, while acceptable, is certainly not the best choice of search direction. Clearly, a greater rate of increase in F is obtained by searching in the gradient direction. In such a situation, Rosen's method would choose the poorer search direction.

Figure 1. A projected gradient.

Fig. 2 illustrates another situation in which Rosen's method would make a poor choice of search direction.
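In the single-constraint case the projection in the note above reduces to P∇F = ∇F − r∇φ with r = (∇φ · ∇F)/(∇φ · ∇φ). A minimal sketch; the helper below is hypothetical, with plain lists standing in for vectors:

```python
def project_gradient(grad_f, grad_phi):
    """Project grad_f onto the plane orthogonal to a single active
    constraint gradient grad_phi: the one-constraint case of
    P_q(x)grad F = grad F - sum r_i grad phi_i, where
    r = (grad phi . grad F) / (grad phi . grad phi)."""
    dot = sum(g * h for g, h in zip(grad_phi, grad_f))
    nrm2 = sum(h * h for h in grad_phi)
    r = dot / nrm2
    proj = [g - r * h for g, h in zip(grad_f, grad_phi)]
    return proj, r
```

With ∇F = (1, 1) and ∇φ = (0, 1) the projected direction is (1, 0) with r = 1; a positive r is precisely the signal, exploited in Section VII, that dropping the constraint yields a feasible direction of greater ascent.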
The vector P_2(x)∇F(x), the projection of ∇F(x) onto the intersection of the planes x_1 − 3x_2 + x_3 = 18 and 3x_1 − x_2 + x_3 = 20, would be chosen as a search direction by Rosen's method. Clearly, P_1(x)∇F(x), the projection of ∇F(x) onto the plane 3x_1 − x_2 + x_3 = 20 alone, gives an acceptable search direction which yields a greater rate of increase in F.

Figure 2. Alternate projections of gradient.

The situation in Fig. 1 may seem unreasonable because one would not expect some previous one-dimensional search to stop at x. However, the situation in Fig. 2 is reasonable. The point x could very well be the result of a previous one-dimensional search along x_1 − 3x_2 + x_3 = 18. Such a one-dimensional search would stop at x because further searching would violate the constraint 3x_1 − x_2 + x_3 ≤ 20.

Let us designate a constraint φ_i ≥ 0 as "active" at x if |φ_i(x)| ≤ δ_i, where the δ_i's are preassigned small numbers. In other words, the ith constraint is considered active at x if the hypersurface φ_i = 0 appears to be close to x. (A better estimate of closeness would be |φ_i(x)| / ||∇φ_i(x)||, but this is not the primary concern of our discussion.) Rosen's method chooses the next search direction by taking the projection of ∇F(x) onto the intersection of the hyperplanes which are tangent to the surfaces φ_i(·) = φ_i(x) for all active constraints (except when this projection is "small"). The examples in Figures 1 and 2 indicate that sometimes it is better to take the projection of ∇F(x) onto an intersection of a subset of these tangent hyperplanes. The remainder of this paper describes how to recognize this situation and what to do about it.

V. PROJECTIONS

Let us recall a few facts from linear algebra which are used in the proofs of Theorems 1 and 2. Let W_1 ⊃ W_2 be linear subspaces of E^n. Let P_1 and P_2 be the projection mappings associated with W_1 and W_2, respectively.
For any vector A,

(1) P_i A · P_i A = P_i A · A ≤ A · A, i = 1, 2, with < holding whenever A ≠ P_i A.

(2) P_2 P_1 A = P_2 A.

(3) For any vector B ∈ W_1, (A · P_1 A)/||P_1 A|| ≥ (A · B)/||B||, with = holding only when B = aP_1 A, a real and positive.

These facts follow readily from the fact that A − P_i A is perpendicular to W_i, i = 1, 2.

VI. STEEPEST ASCENT VS PROJECTED GRADIENT

The steepest ascent direction at a point x is given by the unit direction vector y which maximizes the rate of change of F (that is, ∇F(x) · y) and does not violate the hyperplanes which are tangent to the active constraint surfaces (i.e., ∇φ_i(x) · y ≥ 0 for active φ_i's). Thus, the steepest ascent direction is given by that vector d which solves the following nonlinear programming problem:

(2) Max ∇F(x) · d, subject to: ∇φ_i(x) · d ≥ 0 for φ_i active, i.e., i ∈ I; d · d = 1.

This direction is the locally best search direction in the case of linear constraints. (Of course, any one-dimensional search in this direction must be accompanied by small corrections to insure that the nonlinear constraints φ_i ≥ 0 are not significantly violated during the search.) Hadley [1, p. 301] mentions that although the solution to (2) can be obtained (by solving a quadratic programming problem), simpler procedures can be used to find a d which satisfies the constraints in (2) and has ∇F(x) · d > 0. Gradient Projection is one of several methods for finding such a d. The following theorem indicates that the solution to (2) is given by the projection of the gradient onto a suitable subspace.

THEOREM 1: The solution to (2) must be a unit vector having the direction of P_q(x)∇F(x) for some I_q = {i_1, . . ., i_q}.

PROOF: Let d̄ be the solution to (2). Set I_q = {i | ∇φ_i · d̄ = 0}, and d_q = P_q∇F/||P_q∇F||. We will show that d̄ = d_q. Suppose not. Then, since d̄ is a unit vector with ∇φ_i · d̄ = 0, i ∈ I_q, we have by V(3),

(3) ∇F · d̄ = P_q∇F · d̄ < P_q∇F · d_q = ∇F · d_q.

Now, for any γ > 0, ∇φ_i · (d̄ + γd_q) = 0, i ∈ I_q.
And, since ∇φ_i · d̄ > 0, i ∉ I_q, γ can be chosen small enough so that ∇φ_i · (d̄ + γd_q) > 0, i ∉ I_q. Let γ̄ > 0 be such a choice of γ. Then d̂ = (d̄ + γ̄d_q)/||d̄ + γ̄d_q|| is a feasible vector for (2). Moreover, since ||d̄ + γ̄d_q|| ≤ 1 + γ̄ and, by (3), ∇F · d_q > ∇F · d̄,

∇F · d̂ = (∇F · d̄ + γ̄ ∇F · d_q)/||d̄ + γ̄d_q|| > (1 + γ̄)(∇F · d̄)/(1 + γ̄) = ∇F · d̄.

This last inequality contradicts the fact that d̄ is the optimal solution to (2). Hence, we must have that d̄ = d_q. Q.E.D.

Theorem 1 implies that (2) can be solved by suitably determining I_q. If we assume that the ∇φ_i(x), i ∈ I (i.e., the gradients of the active constraints), are linearly independent, then the next theorem characterizes the proper choice for I_q.

THEOREM 2: P_q(x)∇F(x) (= ∇F(x) − Σ_{i∈I_q} r_i ∇φ_i(x)) gives the direction of the solution to (2) if and only if

i) P_q(x)∇F(x) · ∇φ_i(x) ≥ 0, i ∉ I_q, and
ii) r_i ≤ 0, i ∈ I_q.

PROOF: Set d_q = P_q∇F/||P_q∇F||.

Sufficiency. Let d be any feasible vector for (2). Then, by use of the equation in the statement of the theorem,

∇F · d = P_q∇F · d + Σ_{i∈I_q} r_i ∇φ_i · d ≤ P_q∇F · d ≤ ||P_q∇F|| = P_q∇F · d_q = ∇F · d_q.

Therefore, d_q maximizes ∇F · d.

Necessity. Condition (i) is necessary because d_q must be a feasible solution to (2). Suppose (ii) does not hold. Let j ∈ I_q be such that r_j > 0. We will show that there is a feasible d̂ for which ∇F · d_q < ∇F · d̂. First, let

I_{q′} = {i ∈ I | ∇φ_i · P_q∇F = 0}.

Notice that P_{q′}∇F = P_{q′}P_q∇F = P_q∇F. The first equality follows from V(2) and the fact that I_q ⊂ I_{q′}. The second equality follows from the method of constructing I_{q′}. Set I_{q′−1} = {i | i ∈ I_{q′} and i ≠ j}. We claim

(a) ∇φ_j · d_{q′−1} ≥ 0, and
(b) ∇F · d_{q′−1} > ∇F · d_q.

To see (a), we observe that

P_{q′}∇F · P_{q′}∇F = P_{q′−1}∇F · P_{q′}∇F = P_{q′−1}∇F · P_q∇F = P_{q′−1}∇F · ∇F − Σ_{i∈I_q} r_i P_{q′−1}∇F · ∇φ_i = P_{q′−1}∇F · ∇F − r_j P_{q′−1}∇F · ∇φ_j = P_{q′−1}∇F · P_{q′−1}∇F − r_j P_{q′−1}∇F · ∇φ_j,

i.e., ||P_{q′}∇F||² = ||P_{q′−1}∇F||² − r_j P_{q′−1}∇F · ∇φ_j, i.e.,

P_{q′−1}∇F · ∇φ_j = (1/r_j)(||P_{q′−1}∇F||² − ||P_{q′}∇F||²) ≥ 0,

where this last inequality follows from the fact that r_j > 0 and I_{q′−1} ⊂ I_{q′}.
Assertion (a) now follows because d_{q′−1} is a unit vector in the direction of P_{q′−1}∇F. To see (b), first notice that if P_{q′−1}∇F = P_{q′}∇F, then we would have

∇F − Σ_{i∈I_{q′−1}} s_i ∇φ_i = ∇F − Σ_{i∈I_{q′}} r_i ∇φ_i, i.e., r_j ∇φ_j(x) = Σ_{i≠j} s_i ∇φ_i − Σ_{i≠j} r_i ∇φ_i, with r_j ≠ 0.

This last statement contradicts the linear independence of the ∇φ_i. Thus P_{q′−1}∇F ≠ P_{q′}∇F, and since P_{q′}∇F = P_{q′}P_{q′−1}∇F by V(2), we must have ||P_{q′−1}∇F|| > ||P_{q′}∇F|| by V(1). But ||P_i∇F|| = P_i∇F · d_i, i = q′, q′−1. So P_{q′−1}∇F · d_{q′−1} > P_{q′}∇F · d_{q′}, i.e., ∇F · d_{q′−1} > ∇F · d_{q′}. But d_{q′} = d_q because, as noted above, P_{q′}∇F = P_q∇F. This gives us (b).

From the definition of I_{q′} and I_{q′−1} it follows that ∇φ_i · d_q = 0, i ∈ I_{q′}, and ∇φ_i · d_{q′−1} = 0, i ∈ I_{q′−1}. This, combined with assertion (a), says that, for γ > 0, ∇φ_i · (d_q + γd_{q′−1}) ≥ 0, i ∈ I_{q′}. Then, since ∇φ_i · d_q > 0, i ∈ I − I_{q′}, we can choose γ = γ̄ small enough so that ∇φ_i · (d_q + γ̄d_{q′−1}) > 0, i ∈ I − I_{q′}. Thus

d̂ = (d_q + γ̄d_{q′−1})/||d_q + γ̄d_{q′−1}||

is a feasible vector for (2). Also, using (b), we have

∇F · d̂ = (∇F · d_q + γ̄ ∇F · d_{q′−1})/||d_q + γ̄d_{q′−1}|| > (1 + γ̄)(∇F · d_q)/(1 + γ̄) = ∇F · d_q.

We have a contradiction to d_q being a solution to (2). This completes the proof of the necessity. Q.E.D.

VII. MODIFIED GRADIENT PROJECTION

This section describes a simple modification to Rosen's method of gradient projection. This modification helps to overcome the deficiencies which were described in Section IV. The modification is justified by the following corollary to Theorem 2.

COROLLARY TO THEOREM 2: If I_q = I and P_q∇F = ∇F − Σ_{i∈I_q} r_i ∇φ_i, where some r_j > 0, then d_{q−1} (with I_{q−1} = I_q − {j}) is a feasible solution to (2) with ∇F · d_{q−1} > ∇F · d_q. (I.e., P_{q−1}∇F gives a better search direction than P_q∇F.)

PROOF: This result follows from the proof of the necessary condition in Theorem 2.

The above corollary suggests the following modification to the method of gradient projection.
Steps 2 and 3 of Section III should be replaced by the following step:

2′. If some r_i > 0, i ∈ I_q, then drop the index j from the set I_q to obtain I_{q−1}. Use the direction of P_{q−1}∇F(x) as the search direction. The index j is determined by

r_j / ||∇φ_j(x)|| = max_{i∈I_q} { r_i / ||∇φ_i(x)|| }.

This modification gives an improved search direction in the sense that it gives a direction of greater rate of increase. In addition, if the recalculated r_i's are all negative, then a "steepest ascent" direction (the solution to (2)) is obtained. The termination of a search iteration usually involves the encountering of no more than one new constraint surface. Thus, permitting one index to be dropped should usually maintain the r_i's as negative. Of course, this will not always be the case.

VIII. ZOUTENDIJK'S DISCUSSION

G. Zoutendijk [4] discusses several algorithms for handling nonlinear programming problems. The algorithms are called "Methods of Feasible Directions." These algorithms fall into the general framework of a sequence of one-dimensional searches. They differ primarily in the method of choosing the search direction. The choice of search direction involves the use of a normalization; different normalizations yield different search directions and, consequently, different algorithms. Zoutendijk's normalization N1 [4, p. 70], together with a choice of θ_i = 0, i ∈ I_c(x_k), for determining S′(x_k) [4, p. 68, Eq. 7.3.6], yields a search direction which corresponds to the direction of steepest ascent. Of course, choosing θ_i = 0 does not fit in with Zoutendijk's methods because he chooses to avoid the necessity of calculating correction steps during the one-dimensional search. Nevertheless, Zoutendijk's discussion of normalization N1 is closely related to the results of this paper. In his discussion of normalization N1 [4, section 8.2, p. 80], Zoutendijk mentions that the resulting direction-finding problem is of the same type as (2) above [4, p. 81, Eq. 8.2.3].
Thus, his results and comments are applicable to the solution of (2). He discusses four methods of solving this problem. Two of the methods are finite and are based on the work of Charnes and Dantzig, and Orden and Wolfe. The other two methods, devised by Zoutendijk, are not finite, but are expected to be usually more efficient. This serves to illustrate that there is evidently no easy way to solve (2). Thus, the small modification to gradient projection proposed in Section VII of this paper can be considered a compromise between solving (2) exactly and accepting the first gradient projection as proposed by Rosen.

IX. CONCLUSION

The direction of steepest ascent in constrained problems has been shown to be the direction given by the projection of the gradient onto a suitably chosen subspace. Moreover, the suitable choice is recognized by the negativity of certain coefficients which arise during the calculation of the projected gradient. The above results give rise to a slight modification to Rosen's gradient projection method. The modified projection method usually yields a search direction which is the direction of "steepest ascent." The modified method, when employed, has the following favorable qualities:

1) The new projected gradient direction is better (has a greater rate of increase of the objective function) than the original projected gradient direction.
2) The number of constraints being followed is reduced. Thus calculations of correction steps and future projections are simplified because the matrix N_q N_q′ (see [3]) is smaller.

REFERENCES

[1] Hadley, G., Nonlinear and Dynamic Programming (Addison-Wesley, Reading, Mass., 1964).
[2] Rosen, J. B., "The Gradient Projection Method for Nonlinear Programming, Part I," SIAM Journal 8, 181-217 (1960).
[3] Rosen, J. B., "The Gradient Projection Method for Nonlinear Programming, Part II," SIAM Journal 9, 414-432 (1961).
[4] Zoutendijk, G., Methods of Feasible Directions (Elsevier Publishing Company, Amsterdam, 1960).

SIMPLE STOCHASTIC NETWORKS: SOME PROBLEMS AND PROCEDURES*

J. M. Burt, Jr.
University of California, Los Angeles, California

D. P. Gaver†
University of California, Berkeley, California

and

M. Perlas
Carnegie-Mellon University, Pittsburgh, Pennsylvania

1. INTRODUCTION

An extensive literature now exists describing and solving various problems related to aspects of project graph analysis. Attention is called in particular to the papers of Charnes, Cooper, and Thompson [2], Hartley and Wortham [5], Kelley [7], Martin [9], and Jewell [6], although there are many others as well. We mention also the systematic book-length treatment of Moder [10]. Initially, project graph analysis (also mnemonically called PERT, GERT, CPM, etc.) assumed that individual task completion times were fixed and known in advance. The unreality of such an assumption in many contexts is apparent. Consequently, attempts to introduce the probability distributions of completion times have been made; noteworthy for such work are [2], [5], and [9] above. Thus, for example, one may be interested in the expectation or other summary of the entire project's completion time as the latter depends upon the distributions of the individual task times. Alternatively, it may be of interest to seek the probability that a particular path will be "critical," i.e., will be the last to be completed. Decision problems of one sort or another may be considered; cf. [2] and [6]. This paper deals with stochastic network problems of the sort just mentioned. Since most stochastic networks of realistic scale are practically impossible to analyze mathematically, we begin by discussing simulation, in particular illustrating the gains possible by use of Monte Carlo approaches for improving naive "straightforward" simulation techniques.
Considerable improvement in simulation efficiency can be made if an approximate, analytically tractable model is available, so we next introduce such models and illustrate their use as a "control variate." The analytical network models suggested utilize families of task length (link time) distributions based on the simple exponential density, and profit by the Markovian or memoryless property associated with the exponential.

*This research was conducted in part under National Bureau of Standards Contract CST 136 at the Operations Research Center, University of California, Berkeley. The work of the authors was also sponsored by the Management Sciences Research Group, Carnegie-Mellon University, Pittsburgh, Pennsylvania.
†Visiting faculty member during the year 1967-1968.

2. SOME MONTE CARLO SIMULATION TECHNIQUES

It seems safe to say that only a minority of the actual networks encountered in practice can be adequately studied using pencil and paper mathematics. This is not to state that analytical approaches are useless: they can, and should, be used to reduce the complexity of parts of the network prior to a computational approach, and they can offer plausibility checks. Ideally, one would like to obtain enough insight into the behavior of stochastic networks so as to largely eliminate the need for formal, perhaps computerized, studies. One way to obtain such insight is to examine various realistic networks by means of sampling experiments, using a computer. Since this is also a popular way of solving specific problems, a discussion of various techniques is given next. Our intention is to describe certain Monte Carlo methods for efficiently dealing with the randomness present. The general purpose is to obtain good determinations of, say, project completion time expectations or distributions while keeping computational effort low.
The methods we discuss were first put to use in physics; see the book by Hammersley and Handscomb [4].

Figure 1.

A. Antithetic Variables

To explain this notion, refer first to the network of Figure 1 and suppose we wish to estimate the expected completion time of T = T_1 + T_2 by simulation, where T_1 and T_2 are assumed to be independent. The discussion which follows is valid for any size network composed of activities whose durations are independent; this small example has been chosen solely to simplify the presentation. The straightforward procedure is to select two uniform random numbers, R_1 and R_2, and transform them to realizations of T_1 and T_2, perhaps by the familiar procedure T_i = F_i^{-1}(R_i), where F_i is the distribution of T_i and F_i^{-1} its inverse. Then the first realization is T^(1) = T_1^(1) + T_2^(1). We could tabulate n such realizations using random numbers and average to obtain the estimate

(2.1) T̄ = (T_1^(1) + T_2^(1) + T_1^(2) + T_2^(2) + . . . + T_1^(n) + T_2^(n))/n.

All random numbers are independent, and the n times are generated from their appropriate distributions, so

(2.2) E[T̄] = E[T_1] + E[T_2]

and

(2.3) Var[T̄] = (Var[T_1] + Var[T_2])/n.

In other words, the simple procedure described gives an unbiassed estimator of E[T] whose variance decreases as 1/n. If the procedure described is repeated many times (n → ∞), then T̄ becomes arbitrarily close to E[T]. Unfortunately, if there are many independent serial links then the sum of the variances in the numerator of (2.3) becomes large, and many repetitions are required in order to determine E[T] accurately. An attempt to avoid this is in order. Notice first that in order to estimate E[T] the realizations T_1^(j) + T_2^(j) and T_1^(j)′ + T_2^(j)′ need not be independent so long as they have the correct marginal distributions.
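The straightforward procedure behind (2.1) can be sketched as follows. The function is hypothetical, and the exponential link distributions are an illustrative choice, not taken from the paper:

```python
import math
import random

def simulate_series_mean(inv_cdfs, n, seed=0):
    """Straightforward Monte Carlo estimate (2.1) of E[T] for a series
    network T = T_1 + ... + T_m: draw one uniform R_i per link, set
    T_i = F_i^{-1}(R_i), sum the links, and average n realizations."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        total += sum(f(rng.random()) for f in inv_cdfs)
    return total / n

# Illustrative links: exponential with means 2 and 3 (inverse transform
# F^{-1}(r) = -m ln(1 - r)), so E[T] = 5 and the estimate settles near 5.
inv1 = lambda r: -2.0 * math.log(1.0 - r)
inv2 = lambda r: -3.0 * math.log(1.0 - r)
```

The 1/n decrease of (2.3) is visible here: quadrupling n roughly halves the standard error, which is exactly the cost the antithetic device below the figure discussion is designed to reduce.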
Intuitively, too, one sees that if, when T_1 is "large" in one realization, it is forced to be correspondingly "small" in another, then the average will tend to be closer to the true value E[T] than in the case of purely independent samples. To accomplish this, we can construct two realizations of T = T_1 + T_2 using the same two random numbers: first obtain R_1 and R_2, thence T_1 and T_2, and finally T; next subtract R_1 and R_2 from 1 to obtain R_i' = 1 - R_i (i = 1, 2), and from these T_1' and T_2', finally adding to get T'. T_1' and T_2' are the antithetic variates of T_1 and T_2. Lastly, average:

(2.4)   \bar{T}_A = [T^{(1)} + T^{(1)'} + T^{(2)} + T^{(2)'} + ... + T^{(n)} + T^{(n)'}] / (2n)
              = (1/2)(\bar{T} + \bar{T}').

Now by construction E[\bar{T}] = E[\bar{T}'] = E[T], so the estimate \bar{T}_A is unbiased. Furthermore,

(2.5)   Var[\bar{T}] = Var[\bar{T}'] = (Var[T_1] + Var[T_2]) / n.

However, it should be intuitively apparent that \bar{T} and \bar{T}' tend to be negatively correlated, i.e., cov[\bar{T}, \bar{T}'] < 0. (This property may be proven easily.) Since

(2.6)   Var[\bar{T}_A] = (1/4)(Var[\bar{T}] + Var[\bar{T}']) + (1/2) cov[\bar{T}, \bar{T}']
                      = (1/2) Var[\bar{T}] + (1/2) cov[\bar{T}, \bar{T}'] < (1/2) Var[\bar{T}],

this means that the above simple procedure is more efficient than doubling the total number of independent realizations computed. One can, of course, estimate the variance of \bar{T}_A by simply computing the sample variance of the n independent averages

[T^{(i)} + T^{(i)'}] / 2,   i = 1, 2, ..., n.

Since \bar{T}_A is the average of n independent terms, approximate confidence limits may be placed on E[T] using the Student t tables. The use of a few special random variate generators, such as the Normal variate generator using trigonometric functions, should be avoided when the antithetic procedure is applied. The dramatic effect of antitheticizing a sum is most apparent when T_i is symmetric, e.g., if T_i can be assumed uniform, or normal.
In the latter case we can write

(2.7)   T_i = m_i + σ_i ξ_i,

ξ_i being distributed as N(0, 1); m_i and σ_i are the mean and standard deviation of T_i. Then the obvious antithetic is obtained by simple sign reversal of ξ_i:

(2.8)   T_i' = m_i - σ_i ξ_i,

and an average of T_i with T_i' already yields a zero-variance estimate of m_i.

For the latter example, and for others as well, it should be apparent that the use of antithetic variables is helpful in efficiently establishing the appearance of a distribution function. One first determines an empirical distribution, F(t), from the T_i's, and a second empirical distribution, F'(t), from the T_i''s. The antithetic variate estimate of the true distribution would then be

(2.9)   \hat{F}(t) = [F(t) + F'(t)] / 2   for all t.

If F(t) is overly skewed (relative to the true distribution) to one side, then F'(t) is likely to be overly skewed to the other side, and \hat{F}(t) will be close to the true unknown distribution. It should be noted that estimates of Var(T) should not be formed by simply averaging the sample variances of the antithetic variates; such an estimate may be quite biased. The authors have had some success in eliminating this bias by application of the Tukey-Quenouille jackknife procedure.

Figure 2. (A parallel network of k links.)

As a second example of the use of the antithetic technique consider a simple parallel network, e.g., Figure 2. The goal is to determine the probability with which M = max(T_1, T_2, ..., T_k) is not greater than some number x. For simplicity, let all T_i's have the same rectangular distribution: T_i is uniform on [0, 1]. This is in fact no restriction, for even if the T_i's were arbitrarily but identically distributed, M ≤ x if and only if R_i ≤ y for all i, where y is the common distribution function evaluated at x. To determine the above probability in a straightforward manner we can sample independently for each component link and then tabulate δ, where

δ = 1 if R_1 < x, R_2 < x, ..., R_k < x,
  = 0 otherwise.

Letting δ^{(j)} refer to the jth realization (j = 1, 2,
..., n), the estimate of the desired probability is

(2.10)  \bar{δ} = [δ^{(1)} + δ^{(2)} + ... + δ^{(n)}] / n.

It is simple to see that

(2.11)  E[\bar{δ}] = x^k

and

(2.12)  Var[\bar{δ}] = x^k (1 - x^k) / n.

The ratio

(2.13)  Var[\bar{δ}] / E^2[\bar{δ}] = (x^{-k} - 1) / n

measures the fractional or percent error of the estimate. Plainly a large k tends to make the above error measure large, requiring an enormous n to obtain a good determination.

An antithetic approach might go as follows. With δ associate the antithetic indicator δ', where

(2.14)  δ' = 1 if (1 - R_1) < x, (1 - R_2) < x, ..., (1 - R_k) < x,
           = 0 otherwise.

Obviously δ and δ' cannot be unity simultaneously if x < 1/2; for simplicity x < 1/2 will be assumed. Now form the antithetic estimate

(2.15)  \bar{δ}_A = [δ^{(1)} + δ^{(1)'} + δ^{(2)} + δ^{(2)'} + ... + δ^{(n)} + δ^{(n)'}] / (2n).

Clearly δ + δ' is unity with probability 2x^k. Hence

(2.16)  E[\bar{δ}_A] = 2n x^k / (2n) = x^k,

so the estimate is unbiased. Furthermore,

(2.17)  Var[\bar{δ}_A] = (x^k - 2x^{2k}) / (2n).

The proportional error of the estimate is

(2.18)  Var[\bar{δ}_A] / E^2[\bar{δ}_A] = (x^{-k} - 2) / (2n),

roughly half of (2.13). Of course the above antithetic procedure is merely one possibility. Another is to antitheticize on some, but not all, of the links simultaneously. There are in all 2^k such possibilities.

B. Stratification

The Monte Carlo technique called stratification is an extension of the concept underlying antithetic variates, i.e., the creation of parallel, negatively correlated realizations that are to be averaged. To simplify the exposition, let us initially assume our network contains a single activity whose time-to-completion is denoted by T_1. As mentioned above, realizations of T_1 are created by taking uniform random draws over the unit interval and transforming them, usually via the inverse distribution function. "k-way stratification" is performed in the following manner.
Divide the unit interval (0, 1) underlying realizations of T_1 into k disjoint and exhaustive subintervals; as a practical matter one usually uses subintervals of equal length, although other choices are available (see [4, p. 56]). Draw a uniform random number, R, and compute k parallel realizations of T_1 from the random numbers R/k, (1 + R)/k, ..., (k - 1 + R)/k. Randomly assign these realizations of T_1 to k parallel simulations. Repeat the procedure by drawing a second random number, and so on. The above procedure leads to k parallel sets of realizations of T_1, and it can easily be shown that within any one of these sets, realizations of T_1 are independent. However, between the parallel sets of T_1's there may be considerable negative correlation. Thus, if one set has an unusually large number of low realizations of T_1 (i.e., lower than E[T_1]), some other set(s) will have a correspondingly small number of low realizations of T_1. By taking averages across the sets, low variance estimates of, say, E[T_1] may be obtained. For networks containing many activities, stratification may be performed on some or on all of the activity times, and different k's may be chosen for different activities. There is considerable "art" involved in selecting which activity times should be stratified and how detailed (choice of k) such stratification should be. Generally, for activity times which have relatively large variance and activities which have a high probability of lying on the critical path, one would use fairly detailed stratification (about 4- or 5-way).

C. Control Variates

Suppose we are interested in estimating the distribution function, or a parameter thereof, of the completion time of Network I below. It would be extremely arduous to compute the completion time distribution of that network by analytic methods. The control variate procedure requires that we construct a simplified network "similar" to Network I, such as Network II (see Figure 3).

Figure 3. (Network I and the simplified control Network II.)
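The k-way stratification scheme of paragraph B above can be sketched in a few lines; this is our illustration (not the authors' code), using 4-way stratification of a single exponential activity time of mean 10, the distribution used in the examples of Section 3. The sample size and seed are arbitrary choices.

```python
import math
import random

def stratified_streams(n_rounds, k=4, mean=10.0, seed=1):
    """k-way stratification: each round, a single uniform R yields the k
    stratified uniforms (j + R)/k, j = 0, ..., k-1, which are randomly
    assigned to k parallel simulation streams.  Within any one stream the
    draws are independent; between streams they are negatively correlated."""
    rng = random.Random(seed)
    streams = [[] for _ in range(k)]
    for _ in range(n_rounds):
        r = rng.random()
        draws = [(j + r) / k for j in range(k)]
        rng.shuffle(draws)                      # random assignment to streams
        for stream, u in zip(streams, draws):
            # inverse transform to an exponential activity time T1
            stream.append(-mean * math.log(1.0 - u))
    return streams

streams = stratified_streams(2000)
grand_mean = sum(map(sum, streams)) / sum(map(len, streams))
print(grand_mean)  # close to E[T1] = 10
```

Averaging across the k streams, rather than within a single one, is what converts the negative between-stream correlation into a variance reduction.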
Under certain assumptions (see Section 4) concerning the link time distributions, Network II may be analyzed by exact methods. The general idea of the control variates technique is to exploit the similarity between I and II in order to improve estimates concerning I; in doing so, use is made of the exact knowledge of II. In other words, simulation is used to correct a known result for II to bring it close to I. Again the key to the correction procedure is to reuse the basic random numbers to create parallel realizations. However, this time the attempt is made to achieve a high positive correlation between the behaviors of I and II. To clarify these ideas, suppose we are interested in estimating the expected completion time, E[M_I], of Network I. Suppose further that the distribution functions corresponding to activity times in Network I, except that of T_7, are amenable to mathematical computations, e.g., exponentials, uniforms, gammas, etc. We might construct the control Network II as follows. Let T_A = E[T_1] + T_2; let T_B = E[T_4] + T_5; let T_C be a random variable highly (positively) correlated with T_7, but possessing a simpler (easier to manipulate mathematically) distribution function than that of T_7; and let the remaining link times be as in Network I. We then run parallel simulations of the time to complete each network, using the same set of random draws for corresponding link times; i.e., when we draw a random number to compute T_2, we use the resulting value in computing T_A, and similarly we use the same random draw in computing T_7 and T_C. This then implies that a positive correlation tends to exist between M_I and M_II. Notice that since

(2.20)  E[M_I] = E[M_II] + E[M_I - M_II],

it is possible to estimate E[M_I] by computing E[M_II] analytically, and then estimating E[M_I - M_II] from a sampling experiment. The latter step involves obtaining n realizations, M_I^{(t)} and M_II^{(t)} (t = 1, 2,
..., n), and averaging:

(2.21)  \overline{M_I - M_II} = (1/n) Σ_{t=1}^{n} (M_I^{(t)} - M_II^{(t)}).

Since the link times share the same random numbers, M_I^{(t)} and M_II^{(t)} are positively correlated. Thus

(2.22)  Var[\overline{M_I - M_II}] = (1/n){ Var[M_I] + Var[M_II] - 2 cov[M_I, M_II] }.

Hence it follows that if

(2.23)  Var[M_II] < 2 cov[M_I, M_II],

a reduction in variance over a straight simulation has been achieved. Of course the degree of the reduction depends upon the skill with which the approximating network, II, is selected and analyzed. Usually there is some freedom of choice for II, and the latter will have many fewer links than I. One method for forming the control variate is to use a subnetwork of the original complex network. The subnetwork can be formed by deleting those activities that have low probabilities of being on the critical path.

Certain natural extensions of this method suggest themselves:

(a) As described, the control variables method applies straightforward Monte Carlo to estimate the correction to E[M_II], i.e., the second term of (2.20). Sometimes this estimate can be further improved by supplementary use of antithetic or stratification techniques; see Example 4 in Section 3.

(b) There is no reason to restrict attention to one approximating network. Suppose another, III, is introduced. Then

(2.24)  E[M_I] = w_1 E[M_II] + w_2 E[M_III] + w_1 E[M_I - M_II] + w_2 E[M_I - M_III],

where the weights w_1 and w_2 (w_1 + w_2 = 1) may either be chosen equal to one-half or, perhaps better, estimated by making use of the regression technique; see Hammersley and Handscomb [4, p. 66]. In the latter case we seek w_1, w_2 to minimize

(2.25)  Var[w_1 Δ_1 + w_2 Δ_2] = w_1^2 Var[Δ_1] + 2 w_1 w_2 cov(Δ_1, Δ_2) + w_2^2 Var[Δ_2],

where w_1 + w_2 = 1, Δ_1 = M_I - M_II, and Δ_2 = M_I - M_III. Although the latter variances and covariances are unknown, they may be estimated from the sequences of individual realizations {M_I^{(t)} - M_II^{(t)}} and {M_I^{(t)} - M_III^{(t)}}.
It seems possible that bias introduced by such a procedure for finding weights may be effectively reduced by application of the Tukey-Quenouille jackknife procedure. Finally, there is no necessity for limiting the analysis to two approximating networks.

It should be mentioned that one very simple, and traditional, control variate derives from the original PERT procedure of finding that path through the network the sum of whose expected link times is longest. Then all other paths are ignored and the distribution of network completion time is taken to be normal, with mean equal to the sum of the means of the activities on the above "critical path," and with variance equal to the sum of the variances. MacCrimmon and Ryavec [8] and Van Slyke [11], as well as others, have examined the bias of such an approximation; the control and antithetic variables procedures advanced here should be capable of considerable improvement over this PERT approximation.

(c) The various techniques mentioned have emphasized the determination of the mean or expected time to complete a PERT network; however, they apply equally well to the estimation of the expectations of other functions. One such is

(2.26)  φ(M) = 1 for M ≥ x,
             = 0 for M < x;

the expectation of the latter is just the probability that M ≥ x, which may be of use. In the event that x is rather large (or small), application of the method of importance sampling may be of use in improving precision.

The success of the three Monte Carlo techniques discussed above will vary from network to network. Indeed, even on the same network, the success of the techniques will vary from experiment to experiment due to randomness of the draws generated for an experiment. In the section to follow, several examples of these techniques are illustrated numerically so that the reader may get a feel for the improvements that are "typical."
In the process, control networks involving exponentially distributed link times are employed. Methods for handling such networks analytically (by the Markov property and the Laplace transform) and numerically (by numerical transform inversion) are relegated to a later section.

3. EXAMPLES

To make a beginning, the following very simple networks, Figure 4, were treated by means of the antithetic and stratification procedures.

EXAMPLE 1:

Figure 4. (Network 1: Wheatstone; Network 2: series; Network 3: series-parallel.)

The distributions for the link times T_i are identical and independent exponentials:

(3.1)   P{T_i > x} = e^{-x/10}  for x ≥ 0,
                   = 1          for x < 0.

All of these examples may be analyzed analytically, and the actual expected completion times, variances, etc., have been calculated. As a check of the methods suggested, antithetic and three-way stratified sampling were applied, as in Sections 2A and 2B, to estimate the expected completion time. The results concerning variance reduction are summarized in Tables 1 and 2.

Table 1. Antithetic Variates: Variance Reduction (T* = antithetic realization, (T + T')/2)

Network | N   | Var(T)/N | Var(T')/N | Avg. of cols. 3 & 4 | X_1    | Var(T*)/N | X_2
1       | 50  | 5.341    | 5.986     | 5.664               | 5.774  | 1.849     | 3.12:1
1       | 100 | 2.830    | 2.650     | 2.740               | 2.887  | 0.852     | 3.39:1
1       | 200 | 1.526    | 1.443     | 1.485               | 1.443  | 0.428     | 3.37:1
1       | 400 | 0.760    | 0.691     | 0.726               | 0.722  | 0.227     | 3.18:1
2       | 50  | 10.110   | 10.724    | 10.417              | 10.000 | 1.814     | 5.51:1
2       | 100 | 5.133    | 4.781     | 4.957               | 5.000  | 0.905     | 5.53:1
2       | 200 | 2.375    | 2.531     | 2.453               | 2.500  | 0.444     | 5.63:1
2       | 400 | 1.233    | 1.205     | 1.219               | 1.250  | 0.225     | 5.56:1
3       | 50  | 10.486   | 9.537     | 10.011              | 10.000 | 3.644     | 2.74:1
3       | 100 | 4.685    | 5.553     | 5.119               | 5.000  | 1.855     | 2.70:1
3       | 200 | 2.601    | 2.388     | 2.495               | 2.500  | 0.829     | 3.02:1
3       | 400 | 1.319    | 1.243     | 1.281               | 1.250  | 0.422     | 2.96:1

Column X_1 is Var(T)/N computed analytically. Column X_2 is the ratio describing variance reduction: X_1 divided by Var(T*)/N.
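The experiment behind the Network 2 rows of Table 1 can be sketched as follows (our code, not the authors' program). We take Network 2 to be a five-link series network, which matches the analytic variance Var(T) = 500 implied by Table 1's X_1 column; the link times follow (3.1), so E[T] = 50.

```python
import math
import random

def antithetic_series(n_pairs, links=5, mean=10.0, seed=2):
    """Antithetic estimate (2.4) of E[T] for a series network of
    exponential links: the uniforms R_i are reused as 1 - R_i to build
    the negatively correlated partner T', and pair averages are averaged."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_pairs):
        rs = [rng.random() for _ in range(links)]
        t = sum(-mean * math.log(1.0 - r) for r in rs)   # T from R_i
        t_prime = sum(-mean * math.log(r) for r in rs)   # T' from 1 - R_i
        total += 0.5 * (t + t_prime)
    return total / n_pairs

print(antithetic_series(10000))  # close to E[T] = 5 * 10 = 50
```

Note that -mean*log(1 - (1 - r)) reduces to -mean*log(r), so the antithetic realization costs no extra random numbers, only extra arithmetic.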
Table 2. Stratification: Variance Reduction (T** = stratified realization, (T + T' + T'')/3)

Network | N   | Var(T)/N | Var(T')/N | Var(T'')/N | Avg. of cols. 3-5 | X_1    | Var(T**)/N | X_2
1       | 50  | 6.368    | 5.869     | 5.761      | 5.999             | 5.774  | 1.171      | 4.93:1
1       | 100 | 2.846    | 3.091     | 2.939      | 2.959             | 2.887  | 0.631      | 4.58:1
1       | 200 | 1.464    | 1.595     | 1.355      | 1.471             | 1.443  | 0.319      | 4.52:1
1       | 400 | 0.758    | 0.704     | 0.693      | 0.718             | 0.722  | 0.153      | 4.72:1
2       | 50  | 9.565    | 9.831     | 10.235     | 9.887             | 10.000 | 1.946      | 5.14:1
2       | 100 | 5.090    | 4.713     | 5.064      | 4.956             | 5.000  | 0.934      | 5.35:1
2       | 200 | 2.479    | 2.411     | 2.457      | 2.449             | 2.500  | 0.454      | 5.51:1
2       | 400 | 1.235    | 1.243     | 1.203      | 1.227             | 1.250  | 0.229      | 5.46:1
3       | 50  | 9.942    | 10.737    | 10.001     | 10.226            | 10.000 | 2.422      | 4.13:1
3       | 100 | 5.037    | 5.569     | 5.561      | 5.389             | 5.000  | 1.258      | 3.98:1
3       | 200 | 2.754    | 2.432     | 2.542      | 2.576             | 2.500  | 0.592      | 4.22:1
3       | 400 | 1.378    | 1.225     | 1.272      | 1.292             | 1.250  | 0.311      | 4.02:1

Column X_1 is Var(T)/N computed analytically. Column X_2 is the ratio describing variance reduction: X_1 divided by Var(T**)/N.

As can be seen, the variance reduction obtained by utilizing the antithetic or stratification procedure is greater than that from doubling the sample size and sampling independently. We remark, in passing, that the estimate of completion time variance obtained by simple averaging of the variance estimates from the antithetic experiments, i.e., the estimate in column 5 of Table 1, is likely to be biased. Agreement with the calculated values seems, however, to be quite satisfactory for the present example.

In the second example control variates are applied to Network 1 (Wheatstone) above.

EXAMPLE 2: Obvious possible control networks to use in connection with Network 1, as shown in Figure 5, are:

Figure 5. (Control 1 and Control 2.)

*The random variables, T_i, in the control networks are identical to those in the original Network 1.
The completion time for Network 1 may be written as

(3.2)   T = max(T_1 + T_2, T_1 + T_3 + T_5, T_4 + T_5),

while those for Controls 1 and 2 are, respectively,

(3.3)   C(1) = T_1 + T_3 + T_5,
        C(2) = T_1 + max(T_2, T_3 + T_5).

Again the true values may be computed:

(3.4)   E[T] = 34.58,   E[C(1)] = 30,   E[C(2)] = 32.5.

Table 3 illustrates the effect of the technique. In the table, the control estimate is

(3.5)   \bar{T}_{C(i)} = \bar{T} + (E[C(i)] - \bar{C}(i)),   i = 1, 2.

Here the mean-squared error (m.s.e.) reduction is tabulated in addition to the variance reduction; the m.s.e. is estimated by averaging the squared difference between the control estimate and E[T]. The relevant variances are Var[T] = 289, Var[C(1)] = 300, Var[C(2)] = 294.

Table 3. Effects on Variances

Number of realizations, N  | 50     | 100    | 200    | 400
\bar{T}                    | 37.74  | 33.84  | 34.01  | 34.84
Var(T)/N (est.)            | 6.23   | 2.35   | 1.47   | 0.75
\bar{C}(1)                 | 27.31  | 29.32  | 29.13  | 30.34
\bar{T}_{C(1)}             | 35.05  | 34.02  | 34.88  | 34.50
Var(T_{C(1)})/N (est.)     | 1.97   | 0.72   | 0.41   | 0.36
Var. red.                  | 3.16:1 | 3.27:1 | 3.59:1 | 2.88:1
M.S.E. red.                | 3.86:1 | 4.03:1 | 2.98:1 | 3.04:1
\bar{C}(2)                 | 34.84  | 31.35  | 32.04  | 32.70
\bar{T}_{C(2)}             | 35.40  | 34.99  | 34.47  | 34.64
Var(T_{C(2)})/N (est.)     | 0.82   | 0.46   | 0.25   | 0.15
Var. red.                  | 7.59:1 | 5.11:1 | 5.88:1 | 5.00:1
M.S.E. red.                | 5.09:1 | 4.17:1 | 4.93:1 | 4.03:1

The reader will observe that the variance and squared-error improvements obtained with the first control variate are roughly the same as those achieved with the antithetic variate and stratification procedures; however, the second control variate yielded improvements almost twice as large as those of the other Monte Carlo techniques.

EXAMPLE 3: This example refers to the network depicted in Figure 6.

Figure 6.

It will be of interest to apply some of our techniques to this problem, since it exhibits greater complexity than the examples previously treated. Obviously, complication is introduced by such "crossing" links as link 10, having completion time T_10. If all link times have the same mean (assumed here) it seems natural to use the partial network of Figure 7 as a control.
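Returning to Example 2 for a moment, the control estimate (3.5) with Control 1 can be sketched as follows (our code, not the authors'). The link times follow (3.1), C(1) = T_1 + T_3 + T_5 has E[C(1)] = 30 known exactly, and the same draws feed both T and C(1); the sample size is an arbitrary choice.

```python
import math
import random

def control_estimate(n, mean=10.0, seed=3):
    """Control-variate estimate (3.5): T_C(1) = T-bar + (E[C(1)] - C(1)-bar),
    where T = max(T1+T2, T1+T3+T5, T4+T5) per (3.2) and C(1) = T1+T3+T5
    is built from the very same exponential draws."""
    rng = random.Random(seed)
    t_bar = c_bar = 0.0
    for _ in range(n):
        t1, t2, t3, t4, t5 = (-mean * math.log(1.0 - rng.random())
                              for _ in range(5))
        t_bar += max(t1 + t2, t1 + t3 + t5, t4 + t5)   # Wheatstone time (3.2)
        c_bar += t1 + t3 + t5                          # Control 1, same draws
    return t_bar / n + (3.0 * mean - c_bar / n)        # E[C(1)] = 30 exactly

print(control_estimate(20000))  # close to the exact value E[T] = 34.58
```

Because T and C(1) share the draws of T_1, T_3, and T_5, the difference T - C(1) has far smaller variance than T itself, which is the mechanism behind the ratios in Table 3.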
Of course, other possibilities (e.g., that involving links 1, 5, 10, 7, 8, and 9) are of interest as well. It turns out that the control network of Figure 7 can be completely analyzed numerically (see the next section) if link times are independently and exponentially distributed. Supposing that E[T_i] = 10, we find that

(3.6)   E[T_AB^{(c)}] = 47.5,   Var[T_AB^{(c)}] = 418.8,

where T_AB^{(c)} denotes the time to complete the control network of Figure 7.

Figure 7.

Table 4 summarizes our sampling results when the number of repetitions is 25.

Table 4. Sampling Results, Number of Repetitions: N = 25

                                 | Straight | Antithetic | Antithetic estimate | Control estimate | Control-regression
Means
Full Network (\bar{T}_AB)        | 58.09    | 50.95      | 54.5                | 57.4             | 57.5
Control Network (\bar{T}^{(c)})  | 48.2     | 42.7       | 45.5                |                  |
Variances
Full Network                     | 790      | 373        | 208                 | 314              | 298
Control Network                  | 689      | 267        | 111                 |                  |

The antithetic and control procedures adopted here reduce the straightforward variance of the estimate by one-half to one-third. The above experiment was repeated with N = 50 independent repetitions. The relative variance reductions were found to be quite comparable to those for N = 25 (see Table 5).

Table 5. Sampling Results, Number of Repetitions: N = 50

                                 | Straight | Antithetic | Antithetic estimate | Control estimate | Control-regression
Means
Full Network                     | 54.6     | 54.2       | 54.4                | 54.2             | 54.3
Control Network                  | 47.9     | 46.8       | 47.4                |                  |
Variances
Full Network                     | 328      | 395        | 105                 | 110              | 91
Control Network                  | 390      | 382        | 116                 |                  |

EXAMPLE 4: Again the network considered is that of Figure 6. The experiments carried out are related to, but somewhat more extensive than, those of the previous example. We have completed independent samples of N = 25 and N = 50 (10 of each). From each sample (25 and 50) we computed the various estimates obtained in the last experiment, and estimates of the variances of those estimates. Two additions were made to the previous experiment. First, an alternative control network, depicted in Figure 8, is utilized for comparison with that of Figure 7.
Figure 8.

The control networks of Figures 7 and 8 will be called the Upper Control Network and the Lower Control Network, respectively. Second, a combined antithetic-control estimate is added to the list of those considered. In brief, the antithetic estimators for T_AB and T_AB^{(c)} are constructed; these are

(3.7)   \bar{T}_AB^{A} = (\bar{T}_AB + \bar{T}_AB') / 2   and   \bar{T}_AB^{(c)A} = (\bar{T}_AB^{(c)} + \bar{T}_AB^{(c)'}) / 2.

The same random numbers that appear in \bar{T}_AB^{(c)} appear also in \bar{T}_AB, as was true earlier; their antithetic versions appear in \bar{T}_AB' and \bar{T}_AB^{(c)'}. An unbiased control estimate of E[T_AB] may be created using the variables (3.7):

(3.8)   \hat{E}[T_AB]_{A,C} = E[T_AB^{(c)}] + [\bar{T}_AB^{A} - \bar{T}_AB^{(c)A}].

(\hat{E}[T_AB]_{A,C} denotes the combined antithetic-control estimate of E[T_AB].) Finally, a regression-adjusted estimate of the form

(3.9)   \hat{E}[T_AB]_{A,C,R} = \bar{T}_AB^{A} + β(\bar{T}_AB^{(c)A} - E[T_AB^{(c)}])

is considered, where the constant β is so selected as to approximately minimize the mean-squared error of the estimate; see (2.24). (\hat{E}[T_AB]_{A,C,R} denotes the regression-adjusted antithetic-control estimate of E[T_AB].)

The results of the simulation experiments are presented in Tables 6 and 7. Examination of the tables reveals the effects of the several variance-reducing maneuvers employed. Roughly speaking, a halving of the variance of the estimate results when either antithetic or control methods alone are used. Present evidence indicates that the Upper Control Network is the more effective as a control variate. If both the antithetic and the control devices are used simultaneously, as in (3.8), the variance is halved again.

A word about the regression-adjusted estimates is in order. It may be shown that the parameter β in (3.9) is best taken to be

β_opt = - cov[\bar{T}_AB^{(c)A}, \bar{T}_AB^{A}] / Var[\bar{T}_AB^{(c)A}],

Table 6. Simulation Results:
Means, N = 25

Lower Control Network
Run | Straight | Control | Control-regression | Antithetic | Antithetic-control | Regression-adjusted antithetic-control
1   | 50.1 | 50.5 | 51.2 | 54.5 | 51.4 | 51.4
2   | 56.9 | 54.6 | 55.6 | 54.5 | 56.4 | 57.1
3   | 53.9 | 56.0 | 55.4 | 55.0 | 55.6 | 55.5
4   | 49.6 | 52.7 | 52.5 | 52.3 | 53.8 | 54.0
5   | 54.3 | 54.3 | 54.3 | 51.2 | 54.1 | 53.6
6   | 55.1 | 52.3 | 52.3 | 53.4 | 54.3 | 54.0
7   | 54.2 | 53.5 | 53.5 | 54.6 | 54.7 | 54.7
8   | 54.1 | 51.8 | 52.1 | 56.8 | 54.1 | 54.5
9   | 57.6 | 52.9 | 54.8 | 51.8 | 53.0 | 52.9
10  | 56.8 | 56.1 | 56.4 | 57.3 | 56.6 | 56.7
Average          | 55.1 | 53.5 | 53.8 | 54.1 | 54.4 | 54.4
Var. of estimate | 6.2  | 3.2  | 3.1  | 4.0  | 2.4  | 2.7
S.D. of estimate | 2.5  | 1.8  | 1.8  | 2.0  | 1.56 | 1.63

Upper Control Network
Run | Straight | Control | Control-regression | Antithetic | Antithetic-control | Regression-adjusted antithetic-control
1   | 50.1 | 57.4 | 57.5 | 54.5 | 56.5 | 56.5
2   | 56.9 | 55.1 | 55.9 | 54.5 | 53.7 | 53.6
3   | 53.9 | 51.7 | 52.0 | 55.0 | 54.2 | 54.2
4   | 49.6 | 53.9 | 52.7 | 52.3 | 53.4 | 53.4
5   | 54.3 | 52.1 | 52.4 | 51.2 | 52.5 | 52.3
6   | 55.1 | 54.7 | 54.7 | 53.4 | 53.3 | 53.3
7   | 54.2 | 55.3 | 54.9 | 54.6 | 55.0 | 55.0
8   | 54.1 | 54.8 | 54.7 | 56.8 | 55.9 | 55.9
9   | 57.6 | 50.5 | 52.0 | 51.8 | 53.3 | 53.0
10  | 56.8 | 55.8 | 56.2 | 57.3 | 56.3 | 56.4
Average          | 55.1 | 54.1 | 54.3 | 54.1 | 54.3 | 54.4
Var. of estimate | 6.2  | 4.4  | 3.8  | 4.0  | 2.0  | 2.3
S.D. of estimate | 2.5  | 2.1  | 1.9  | 2.0  | 1.4  | 1.5

Table 7. Simulation Results: Means, N = 50

Lower Control Network
Run | Straight | Control | Control-regression | Antithetic | Antithetic-control | Regression-adjusted antithetic-control
1   | 54.6 | 56.0 | 55.6 | 54.4 | 54.1 | 54.2
2   | 55.5 | 56.5 | 56.2 | 57.5 | 54.9 | 55.6
3   | 54.7 | 55.2 | 55.0 | 55.5 | 54.7 | 54.8
4   | 56.1 | 58.6 | 57.8 | 55.2 | 56.8 | 56.8
5   | 51.1 | 51.2 | 51.2 | 51.9 | 54.2 | 53.6
6   | 50.5 | 56.7 | 54.6 | 53.3 | 54.9 | 54.3
7   | 58.7 | 55.4 | 56.2 | 55.1 | 54.6 | 54.8
8   | 52.8 | 52.3 | 52.3 | 56.2 | 53.4 | 53.9
9   | 58.9 | 54.2 | 55.6 | 57.3 | 55.6 | 55.9
10  | 54.2 | 52.5 | 52.9 | 52.7 | 52.9 | 52.9
Average          | 54.7 | 54.9 | 54.7 | 54.9 | 54.6 | 54.7
Var. of estimate | 7.9  | 5.3  | 4.1  | 3.5  | 1.2  | 1.22
S.D. of estimate | 2.8  | 2.3  | 2.0  | 1.9  | 1.09 | 1.10

Upper Control Network
Run | Straight | Control | Control-regression | Antithetic | Antithetic-control | Regression-adjusted antithetic-control
1   | 54.6 | 54.2 | 54.3 | 54.4 | 54.6 | 54.5
2   | 55.5 | 54.8 | 55.0 | 57.5 | 56.1 | 56.5
3   | 54.7 | 52.7 | 52.9 | 55.5 | 54.6 | 54.8
4   | 56.1 | 54.4 | 54.7 | 55.2 | 53.8 | 53.8
5   | 51.1 | 55.9 | 54.6 | 51.9 | 54.4 | 53.8
6   | 50.5 | 52.3 | 51.9 | 53.3 | 54.4 | 54.1
7   | 58.7 | 53.1 | 54.0 | 55.1 | 53.6 | 53.7
8   | 52.8 | 56.1 | 55.4 | 56.2 | 57.2 | 56.8
9   | 58.9 | 56.7 | 57.3 | 57.3 | 56.3 | 56.4
10  | 54.2 | 54.1 | 54.2 | 52.7 | 54.8 | 54.4
Average          | 54.7 | 54.4 | 54.4 | 54.9 | 55.0 | 54.9
Var. of estimate | 7.9  | 2.2  | 2.1  | 3.5  | 1.4  | 1.5
S.D. of estimate | 2.8  | 1.5  | 1.4  | 1.9  | 1.2  | 1.2

and it is obvious that the covariance term in β_opt must be estimated from data; see [4, p. 67]. In the experiments, a β-estimate was formed for each sample of 25 or 50 and used to create the Control-Regression and Regression-Adjusted Antithetic-Control figures.

4. ANALYTICAL METHODS BASED ON EXPONENTIAL DISTRIBUTIONS

The assumption that exponential distributions govern link times in networks often results in considerable mathematical simplicity. Of course the link times encountered in real problems are apt not to be of exponential character. Their densities are usually expected to exhibit a mode (peak) at some positive time value, and perhaps to have a vaguely normal or Gaussian appearance. Two remarks may be made:

(i) If empirical observation indeed suggests a non-exponential form of the type described above, combinations (e.g., convolutions) of exponentials may be used as approximations.

(ii) Certain types of operations may be expected, theoretically, to give rise to near-exponential distributions. These are of a quasi-Sisyphean character: link traversal terminates when a certain task is successfully completed, but if success does not occur the task must be repeated; success occurs on any attempt with small independent probability p. Certain computer program debugging, or other troubleshooting, processes may be of this nature.
Construction completion times might have a similar character if weather conditions intervene and necessitate repetitions. In another context, there is the time required to pass through a part of a transportation route on which one may encounter randomly placed congestion points, each of which offers independently and similarly distributed delay: if the number of congestion points has a geometric distribution with a reasonably large mean, the time required has near-exponential properties. Thus the exponential may be worth consideration, and we shall use it as a basis for discussion here.

These simple properties of independently and exponentially distributed random variables are well known or easily derived: (a) sums of identically distributed exponential variables are gamma distributed; (b) sums of differently distributed exponential variables have densities which are linear (nonconvex) combinations of exponential densities; and (c) the maximum of n exponential random variables has distribution function equal to the product of the respective distribution functions. Properties (a) and (b) bear on the modelling of simple series networks, while (c) relates to links in parallel. Sometimes it is possible to build up complex network distributions from simpler ones as follows.

Figure 9. (A complex dotted path with completion time T_U, in parallel with a single exponential link with completion time T_L.)

The dotted path in Figure 9 represents a complex network with T_U as completion time, while T_L is a single exponential path (rate μ) in parallel. Then

(4.1)   P{T_AB ≤ x} = (1 - e^{-μx}) U(x),

where U is the d.f. of T_U. If we now introduce the Laplace transform E[e^{-sT_AB}] we find that

(4.2)   E[e^{-sT_AB}] = \int_0^∞ e^{-sx}(1 - e^{-μx}) dU(x) + \int_0^∞ e^{-sx} U(x) μ e^{-μx} dx
                      = \tilde{U}(s) - \tilde{U}(s + μ) + [μ/(s + μ)] \tilde{U}(s + μ),

\tilde{U}(s) being the Laplace-Stieltjes transform of U. This procedure can obviously be repeated to generate further exponential parallel links, with the last transform generated serving as \tilde{U}(s) at the next stage. Consider

SPECIAL CASE A: Take T_U itself exponential:

(4.3)   U(x) = 1 - e^{-λx}   for x ≥ 0,
             = 0             for x < 0.

Then immediately
(4.4)   E[e^{-sT_AB}] = λ/(λ + s) - λ/(λ + μ + s) + [μ/(s + μ)] [λ/(λ + μ + s)].

If λ = μ the above becomes

E[e^{-sT_AB}] = [2λ/(2λ + s)] [λ/(λ + s)].

Thus T_AB is the sum of an exponential with mean (2λ)^{-1} and another with mean λ^{-1}; this follows directly from the Markovian (memoryless) property of the exponential density.

A useful generalization of the above situation occurs when T_L has the distribution

(4.5)   P{T_L ≤ x} = 1 - Σ_j c_j e^{-μ_j x}.

Then basic transform properties provide that

(4.6)   E[e^{-sT_AB}] = s \int_0^∞ e^{-sx} [1 - Σ_j c_j e^{-μ_j x}] U(x) dx = s V(s) - s Σ_j c_j V(s + μ_j);

here the Laplace transform of the distribution U is

(4.7)   V(s) = \int_0^∞ e^{-sx} U(x) dx = (1/s) \int_0^∞ e^{-sx} dU(x) = (1/s) \tilde{U}(s).

Consequently,

(4.8)   E[e^{-sT_AB}] = \tilde{U}(s) - s Σ_j [c_j/(s + μ_j)] \tilde{U}(s + μ_j).

SPECIAL CASE B: Suppose T_L is distributed as a convex combination (probability mixture) of different exponentials; i.e., with probability c_j the lower link time is exponential with parameter μ_j. Transform (4.8) then represents this situation. Repetition allows many parallel paths to be added.

SPECIAL CASE C: Now take T_L to be distributed as a sum of two dissimilar exponentials:

(4.9)   E[e^{-sT_L}] = [μ_1/(μ_1 + s)] [μ_2/(μ_2 + s)]
                     = [μ_2/(μ_2 - μ_1)] [μ_1/(μ_1 + s)] - [μ_1/(μ_2 - μ_1)] [μ_2/(μ_2 + s)].

Thus T_L has the distribution (4.5) with c_1 = μ_2(μ_2 - μ_1)^{-1} and c_2 = -μ_1(μ_2 - μ_1)^{-1}. More dissimilar components may be handled by applying a partial fractions decomposition to the transform of T_L.

SPECIAL CASE D: Consider the problem of Case C, but put μ_2 = μ_1 + δ, later allowing δ → 0; T_L thus has a gamma distribution. We find that

(4.10)  E[e^{-sT_AB}] = \tilde{U}(s) - [s/(s + μ_1)] \tilde{U}(s + μ_1)
                        + (s μ_1/δ) [ \tilde{U}(s + μ_1 + δ)/(s + μ_1 + δ) - \tilde{U}(s + μ_1)/(s + μ_1) ],

and when δ → 0,

(4.11)  E[e^{-sT_AB}] = \tilde{U}(s) - [s/(s + μ_1)] \tilde{U}(s + μ_1) + s μ_1 (d/dξ)[\tilde{U}(ξ)/ξ]_{ξ = s + μ_1}.

The formula (4.11) can be derived directly. Suppose T_L has the gamma distribution

(4.12)  P{T_L ≤ x} = 1 - e^{-ax} [1 + ax/1! + ... + (ax)^{k-1}/(k-1)!].

Then by standard transform properties

(4.13)  E[e^{-sT_AB}] = s \int_0^∞ e^{-sx} { 1 - e^{-ax} [1 + ax/1! + ... + (ax)^{k-1}/(k-1)!] } U(x) dx,

or

(4.14)  E[e^{-sT_AB}] = \tilde{U}(s) - s Σ_{j=0}^{k-1} [(-a)^j / j!] (d^j/dξ^j)[\tilde{U}(ξ)/ξ]_{ξ = s + a}.
(In (4.14) the zero-order derivative is taken to be the function itself.) Clearly (4.11) and (4.14) agree when k = 2 and a = μ_1.

SPECIAL CASE E: Refer to Example 3 of the previous section; the subnetwork of Figure 7 may be studied directly by means of (4.14). First put

(4.15)  \tilde{U}(s) = E[e^{-s(T_2 + T_3)}] = [μ/(μ + s)]^2.

Then according to (4.14), with k = 2 and a = μ, the transform associated with the completion time of the parallel network involving link times T_2, T_3, T_5, and T_6 is

(4.16)  E[e^{-sT}] = [μ/(μ + s)]^2 - s μ^2 / [(μ + s)(2μ + s)^2]
                     - s μ^3 { 1/[(μ + s)^2 (2μ + s)^2] + 2/[(μ + s)(2μ + s)^3] }.

It is now only necessary to multiply this expression by the transform of the sum of the links involving T_1 and T_4, namely [μ/(μ + s)]^2, in order to obtain E[e^{-sT_AB}] for the subnetwork of Figure 7. The result may be simplified to

(4.17)  E[e^{-sT_AB}] = 2 [μ/(μ + s)]^4 [μ/(2μ + s)]^2 + 4 [μ/(μ + s)]^3 [μ/(2μ + s)]^3.

Differentiation then produces the mean, variance, and other desired moments. Although it is apparently possible to invert such transforms analytically, the results will typically appear as incomprehensible combinations of exponentials. We favor the numerical inversion of such transforms, using perhaps the method of Abate and Dubner [1]. For given link time distributions one can thus quickly develop a table of the distribution function of T_AB for the approximating network.

REFERENCES

[1] Abate, J. and H. Dubner, "Numerical Inversion of Laplace Transforms by Relating Them to the Finite Fourier Transform," Journal of the ACM, 15, No. 1 (Jan. 1968).
[2] Charnes, A., W. W. Cooper, and G. L. Thompson, "Critical Path Analyses Via Chance Constrained and Stochastic Programming," Operations Research, 12, 460-470 (1964).
[3] Feller, W., An Introduction to Probability Theory and Its Applications, Vol. II (John Wiley and Sons, New York, 1966).
[4] Hammersley, J. and D. C. Handscomb, Monte Carlo Methods (Methuen and Co., Ltd., London, 1967).
[5] Hartley, H. O. and A. W. Wortham, "A Statistical Theory for PERT Critical Path Analysis," Management Science, 12, 469-481 (June 1966).
[6] Jewell, W., "Risk Taking in Critical Path Analysis," Management Science, 11, 438-443 (Jan. 1965).
[7] Kelley, J. E., "Critical Path Planning and Scheduling: Mathematical Basis," Operations Research, 9, 296-320 (1961).
[8] MacCrimmon, K. R. and C. A. Ryavec, "An Analytical Study of the PERT Assumptions," Operations Research, 12, 16-37 (1964).
[9] Martin, J. J., "Distribution of the Time Through a Directed Acyclic Network," Operations Research, 13, 46-66 (1965).
[10] Moder, J. and C. Phillips, Project Management with CPM and PERT (Reinhold Publishing Co., New York, 1964).
[11] Van Slyke, R. M., "Monte Carlo Methods and the PERT Problem," Operations Research, 11, 839-860 (1963).
[12] Wiest, J. D., "Some Properties of Schedules for Large Projects with Limited Resources," Operations Research, 12, 395-418 (1964).

A NETWORK ISOLATION ALGORITHM*

M. Bellmore
The Johns Hopkins University

G. Bennington
North Carolina State University†

and

S. Lubore
The MITRE Corporation

ABSTRACT

A set of edges D, called an isolation set, is said to isolate a set of nodes R from an undirected network if every chain between the nodes in R contains at least one edge from the set D. Associated with each edge of the network is a positive cost. The isolation problem is concerned with finding an isolation set such that the sum of its edge costs is a minimum. This paper formulates the problem of determining the minimal cost isolation as a 0-1 integer linear programming problem.
An algorithm is presented which applies a branch and bound enumerative scheme to a decomposed linear program whose dual subproblems are minimal cost network flow problems. Computational results are given. The problem is also formulated as a special quadratic assignment problem, and an algorithm is presented that finds a locally optimal solution. This local solution is used for an initial bound.

INTRODUCTION

Given an undirected connected network G = [N; E] with r sets of distinguished nodes, R_j ⊂ N for j = 1, 2, ..., r, a set of edges D ⊂ E, called an isolation set (or isolation), has the property that the removal of the edges in D partitions the graph into p ≥ r connected components G_1, G_2, ..., G_p. In addition, for the set D to be an isolation with respect to the r distinguished node sets, the p ≥ r connected components must be capable of being grouped into exactly r sets, P_j for j = 1, ..., r, such that each set P_j contains all of the distinguished nodes in R_j. Associated with each edge (x, y) ∈ E of the graph is a positive cost of removal, c(x, y) > 0. A minimal cost isolation, D_0, is an isolation set the sum of whose edge costs is a minimum.

One example of an isolation problem is that faced by an antagonist who desires to attack a communications network in a manner which prevents specific sets of nodes R_j from communicating with other sets of nodes R_i, i ≠ j. If a cost is prescribed for the attack of each link in the network (it is assumed that the cost of attacking nodes is prohibitively high), then the attacker wishes to find a minimal cost isolation. Problems of a similar type have been investigated and solved by Greenberg [5] and Jarvis [6] in a fashion different from the algorithm presented in this paper.

*This paper was presented at the 35th National ORSA meeting in Denver, Colorado, on June 17, 1969.
†This work was performed while the author was at The MITRE Corporation.
A variant of the backboard wiring problem [9] can also be formulated as an isolation problem: consider an existing system composed of subsystems, to which new components are to be added. Each component to be added requires additional wires to be connected to each subsystem and to each other component. Let c(x, y) be the number of wires to be added between x and y. Assuming sufficient space in each subsystem to accommodate any or all of the additional components, one desires to place the new components in the existing subsystems in a manner that minimizes the number of additional wires between the subsystems. The set R_j corresponds to the jth subsystem.

Bennington [1] has related the problem of determining the minimal multicommodity disconnecting set in an undirected graph to the isolation problem.

Example: Figure 1 shows an undirected graph associated with a simple example of the isolation problem.

[Figure 1. Example of the isolation problem: a five-node graph; the legible edge costs are c(1,4) = 1, c(3,4) = 1, and c(2,5) = 2.]

It is desired to isolate nodes 1, 2, and 3 from each other. The number adjacent to each edge in the graph is the cost to remove that edge. The minimal cost isolation has a value of 3, corresponding to removal of the edges {(1,4), (2,5)} or {(1,4), (3,4), (3,5)}.

THE ISOLATION PROBLEM AS A 0-1 INTEGER LINEAR PROGRAM

Suppose that an isolation has been found for the network G = [N; E]; then each node x ∈ N is contained in exactly one of the partitions P_j, j = 1, ..., r. In addition, each R_j ⊂ P_j. Let

π_j(x) = 1 if x ∈ P_j, and π_j(x) = 0 otherwise.

Each undirected edge of the network is assigned an arbitrary orientation; that is, each edge is defined by one of the ordered pairs (x, y) or (y, x) corresponding to the endpoints (nodes) of the edge. The isolation problem is stated as:

Minimize

Z = \sum_{(x,y) \in E} c(x,y) \sum_{j=1}^{r} [\gamma_j(x,y) + \delta_j(x,y)]

subject to

(1)  \sum_{j=1}^{r} \pi_j(x) = 1,  x ∈ N − ∪R_j,

(2)  \pi_j(x) = 1, x ∈ R_j;  \pi_j(x) = 0, x ∈ R_i, i ≠ j;  j = 1, ..., r,
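The Figure 1 example is small enough to check exhaustively. The sketch below is not from the paper; it brute-forces all edge subsets. The costs c(3,5) and c(4,5) are not legible in the scanned figure, so the values used here are assumptions chosen to be consistent with the two optimal isolations quoted in the text.

```python
from itertools import combinations

# Assumed edge costs for the Figure 1 example; c(3,5) and c(4,5) are
# illegible in the scan and are invented, consistent with the stated optima.
COSTS = {(1, 4): 1, (3, 4): 1, (3, 5): 1, (2, 5): 2, (4, 5): 2}
NODES = {1, 2, 3, 4, 5}

def component_of(edges):
    """Label every node with the id of its connected component."""
    adj = {n: [] for n in NODES}
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    comp, cid = {}, 0
    for n in NODES:
        if n not in comp:
            cid += 1
            comp[n] = cid
            stack = [n]
            while stack:
                u = stack.pop()
                for v in adj[u]:
                    if v not in comp:
                        comp[v] = cid
                        stack.append(v)
    return comp

def isolates(removed):
    """True if removing `removed` puts nodes 1, 2, 3 in distinct components."""
    comp = component_of([e for e in COSTS if e not in removed])
    return len({comp[1], comp[2], comp[3]}) == 3

best_cost = min(sum(COSTS[e] for e in sub)
                for r in range(len(COSTS) + 1)
                for sub in combinations(COSTS, r)
                if isolates(set(sub)))
```

Under these assumed costs both quoted optima cost 3, and enumeration confirms that no cheaper isolation exists.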
(3)  \pi_j(x) - \pi_j(y) + \gamma_j(x,y) - \delta_j(x,y) = 0,  (x,y) ∈ E,

(4)  \pi_j(x), \gamma_j(x,y), \delta_j(x,y) are 0-1 variables.

Equations (2) fix each of the distinguished nodes x ∈ R_j into a single subset P_j. If nodes x and y are in the same set, say P_s, corresponding to a feasible isolation, then π_s(x) = π_s(y) = 1 and, from Eqs. (1) and (2), π_j(x) = π_j(y) = 0, j ≠ s. Further, γ_j(x,y) = δ_j(x,y) = 0, j = 1, ..., r, is feasible for Eqs. (3) and, since c(x,y) > 0, minimizes Z for this isolation.

If nodes x and y are in different sets, say P_s and P_t, respectively, corresponding to a feasible isolation, then π_s(x) = π_t(y) = 1; and from Eqs. (1) and (2), π_j(x) = 0, j ≠ s, and π_j(y) = 0, j ≠ t. Further, in order to satisfy Eqs. (3), γ_s(x,y) = 0, δ_s(x,y) = 1; and γ_t(x,y) = 1, δ_t(x,y) = 0. Since c(x,y) > 0, γ_j(x,y) = δ_j(x,y) = 0, j ≠ s, t.

From the above arguments, an edge (x,y) is contained in an isolation if either γ_j(x,y) = 1 or δ_j(x,y) = 1 for some index j. Further, since for each δ_s(x,y) = 1 there exists a corresponding γ_t(x,y) = 1, and both contribute to the value of the objective function, (1/2)Z is the value of a minimal cost isolation.

The proposed algorithm solves the isolation problem using the Land and Doig [7] branch and bound enumerative scheme with restrictions (4) replaced by nonnegativity constraints (i.e., as a series of linear programs). Each of the linear programs is decomposed using the Dantzig-Wolfe technique [3]. Equations (1), which link the r subproblems together, correspond to the master problem. The jth subproblem is:

Maximize

W = \sum_{x \in N} \sigma(x)\,\pi_j(x) - \sum_{(x,y) \in E} c(x,y)[\gamma_j(x,y) + \delta_j(x,y)],

subject to (2),

(2A)  \pi_j(x) ≤ 1,  x ∈ N,

(3), and

(4A)  \pi_j(x), \gamma_j(x,y), \delta_j(x,y) ≥ 0,

where σ(x) is the simplex multiplier corresponding to each constraint (1) in the master problem for the free nodes (i.e., x ∉ R_i, i = 1, 2, ..., r),
and σ(x) = 0 for the fixed nodes (i.e., x ∈ ∪R_i). Constraints (2A) have been included in order to insure that the subproblems are bounded. Unless the subproblem is bounded, no feasible dual solution can be found. Since the solution technique to be employed requires the dual solution to these subproblems, this restriction is important.

An alternative formulation of the jth subproblem that is easier to deal with is given by (the subscript j has been dropped from all of the variables to alleviate the notational burden):

Minimize

W' = - \sum_{x \in N} \sigma(x)\,\pi(x) + \sum_{(x,y) \in E} c(x,y)[\gamma(x,y) + \delta(x,y)],

subject to

f(x,y):  \pi(x) - \pi(y) + \gamma(x,y) - \delta(x,y) = 0,  (x,y) ∈ E,

p(x):  -\pi(x) = -1, x ∈ R_j;  -\pi(x) = 0, x ∈ R_i, i ≠ j;  -\pi(x) ≥ -1, x ∈ N − ∪R_i,

\pi(x), \gamma(x,y), \delta(x,y) ≥ 0.

The simplex multipliers associated with the dual of this problem are given to the left of their corresponding equations. For notational convenience, let

f(x, N) = \sum_{y \in N} f(x,y)  and  f(N, x) = \sum_{y \in N} f(y,x).

The dual of this problem is

Maximize

T = \sum_{x \in N} b(x)\,p(x),  where b(x) = 0 if x ∈ R_i, i ≠ j, and b(x) = -1 otherwise,

subject to

\pi(x):  f(x, N) - f(N, x) - p(x) ≤ -\sigma(x),  x ∈ N,

\gamma(x,y):  f(x,y) ≤ c(x,y),
\delta(x,y):  -f(x,y) ≤ c(x,y),  with f(x,y) unrestricted,  (x,y) ∈ E,

p(x) ≥ 0 if x ∈ N − ∪R_i, and p(x) unrestricted if x ∈ ∪R_i.

As before, the dual variables are given to the left of the constraint equations. This problem resembles a single commodity flow problem with the exception of the f(x, N) − f(N, x) − p(x) ≤ −σ(x) equations and the p(x) variables. The problem can be transformed into a more recognizable form. Upon adding the nonnegative slack variables q(x) to the f(x, N) − f(N, x) − p(x) ≤ −σ(x) equations, the problem becomes

(5)  Maximize T = \sum_{x \in N} b(x)\,p(x),

subject to

(6)  f(x, N) - f(N, x) - p(x) + q(x) + \sigma(x) = 0,  x ∈ N,

(7)  -c(x,y) ≤ f(x,y) ≤ c(x,y),  (x,y) ∈ E,

(8)  p(x) ≥ 0 if x ∈ N − ∪R_i, and p(x) unrestricted if x ∈ ∪R_i,

(9)  q(x) ≥ 0,  x ∈ N.
Consider the redundant constraint formed by summing the negatives of Eqs. (6):

- \sum_{x \in N} f(x, N) + \sum_{x \in N} f(N, x) + \sum_{x \in N} p(x) - \sum_{x \in N} q(x) - \sum_{x \in N} \sigma(x) = 0.

Noting that \sum_{x \in N} f(x, N) = \sum_{x \in N} f(N, x), the redundant constraint can be simplified to

(10)  \sum_{x \in N} [p(x) - q(x) - \sigma(x)] = 0.

The constraint matrix formed by including this additional constraint has exactly one +1 and one -1 in each column. Therefore, this problem has a representation as a network flow problem. The directed network corresponding to the dual linear program of the jth subproblem will be denoted by G' = [N'; A']. The set of nodes N' is formed from the original set N by adding a dummy node s'. The set of directed arcs A' is formed from the original set of edges E by replacing each edge with two directed and oppositely oriented arcs and by adding additional arcs to correspond to the variables q(x) and p(x) and the constant σ(x). Since the nonnegative variable q(x) has a coefficient of +1 in Eq. (6) and a coefficient of -1 in Eq. (10), an arc (x, s') is added to A' with an arc flow variable q(x, s') to replace q(x). In a similar manner, σ(x) is replaced by a constant arc flow σ(x, s') or σ(s', x), depending on the sign of the simplex multiplier. The variable p(x) is replaced by an arc flow variable p(s', x) if p(x) ≥ 0, and by the arc flow variables p(s', x) and p(x, s') if p(x) is unrestricted in sign. The characteristics of the arcs in A' are summarized in Table I.

TABLE I. Maximum Cost Flow Problem (Subproblem j)

Arc       | Lower bound | Upper bound | Cost | Comments
f(x, y)   | -c(x, y)    | c(x, y)     | 0    | Replace each edge with two directed arcs
σ(x, s')  | σ(x)        | σ(x)        | 0    | If σ(x) ≥ 0
σ(s', x)  | -σ(x)       | -σ(x)       | 0    | If σ(x) < 0
q(x, s')  | 0           | ∞           | 0    |
p(s', x)  | 0           | ∞           | 0    | x ∈ R_i, i ≠ j
p(s', x)  | 0           | ∞           | -1   | otherwise
p(x, s')  | 0           | ∞           | 0    | x ∈ R_i, i ≠ j
p(x, s')  | 0           | ∞           | 1    | x ∈ R_j

Figure 2 contains a portion of a graph that will be used to illustrate the form of the subproblems.

[Figure 2: a three-node graph with edge costs c(1,2) = 1, c(1,3) = 1, and c(2,3) = 3.]
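Returning to the incidence-matrix claim above: it can be illustrated mechanically. The sketch below is not from the paper; for a hypothetical three-node graph it builds the column of each variable in Eqs. (6) together with the extra row obtained by summing the negatives of Eqs. (6), and checks the network-flow incidence property. (The constants σ(x) occupy no column and are omitted.)

```python
# One row per node equation (6), plus the redundant summed row "s".
NODES = [1, 2, 3]
EDGES = [(1, 2), (1, 3), (2, 3)]
EXTRA = "s"

cols = {}
for (x, y) in EDGES:
    # f(x, y): +1 in row x (term f(x, N)) and -1 in row y (term -f(N, y));
    # the two entries cancel in the summed row.
    cols[f"f({x},{y})"] = {x: 1, y: -1}
for x in NODES:
    # p(x): coefficient -1 in Eq. (6) for node x, +1 in the summed row
    cols[f"p({x})"] = {x: -1, EXTRA: 1}
for x in NODES:
    # q(x): coefficient +1 in Eq. (6) for node x, -1 in the summed row
    cols[f"q({x})"] = {x: 1, EXTRA: -1}

# Network-flow property: every column holds exactly one +1 and one -1.
flow_property = all(sorted(c.values()) == [-1, 1] for c in cols.values())
```

Each column therefore reads as an arc of the modified graph G' = [N'; A'], with the summed row playing the role of the dummy node s'.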
Figure 2. Subproblem network

Assume that in a subproblem π(1) ≤ 1 (node 1 free), π(2) = 0, and π(3) = 1, and that the simplex multipliers are σ_1 = 1, σ_2 = 2, and σ_3 = -1. The network G' = [N'; A'] corresponding to this subproblem is shown in Figure 3. Note that for this particular example some of the parallel arcs can be coalesced with others or deleted completely.

[Figure 3. Modified graph for the subproblem. The first term adjacent to each arc denotes the type of flow variable, and the accompanying triple gives the lower bound, upper bound, and cost of the flow on that arc.]

Fulkerson's Out-of-Kilter Algorithm [4] may be used to solve the subproblems. Although the Out-of-Kilter Algorithm is normally used to solve minimal cost flow problems, in order to apply the Fulkerson Algorithm the flow problem (primal linear program) must maximize the negative of the costs. Hence, since the dual variables obtained from the Out-of-Kilter Algorithm are required to solve the jth subproblem, the flow problem has been cast as one of determining a maximal cost flow solution. In this manner the columns to enter the basis of the master problem are iteratively generated until an optimum solution is obtained.

A SPECIAL QUADRATIC ASSIGNMENT PROBLEM

The isolation problem can also be formulated as a special case of the quadratic assignment problem. (See Lawler [8] for a discussion of the general quadratic assignment problem.)

Minimize

Z = \frac{1}{2} \sum_{j=1}^{r} \sum_{x \in N} \sum_{y \in N} c(x,y)\,\pi_j(x)[1 - \pi_j(y)],

subject to

\sum_{j=1}^{r} \pi_j(x) = 1,  x ∈ N − ∪R_i,

\pi_j(x) = 1, x ∈ R_j;  \pi_j(x) = 0, x ∈ R_i, i ≠ j,

\pi_j(x) = 0 or 1,  x ∈ N,  j = 1, ..., r.
Since

\sum_{j=1}^{r} \sum_{x \in N} \sum_{y \in N} c(x,y)\,\pi_j(x)[1 - \pi_j(y)] = \sum_{x \in N} \sum_{y \in N} c(x,y) - \sum_{j=1}^{r} \sum_{x \in N} \sum_{y \in N} c(x,y)\,\pi_j(x)\,\pi_j(y),

and \sum_{x \in N} \sum_{y \in N} c(x,y) is a constant, the isolation problem can also be stated as follows:

Maximize

Z' = \frac{1}{2} \sum_{j=1}^{r} \sum_{x \in N} \sum_{y \in N} c(x,y)\,\pi_j(x)\,\pi_j(y),

subject to

\sum_{j=1}^{r} \pi_j(x) = 1,  x ∈ N − ∪R_i,

\pi_j(x) = 1, x ∈ R_j;  \pi_j(x) = 0, x ∈ R_i, i ≠ j,

\pi_j(x) = 0 or 1,  x ∈ N,  j = 1, ..., r.

AN ALGORITHM TO GENERATE LOCAL OPTIMA

A modified version of an algorithm of Carlson and Nemhauser [2] is presented which generates a local minimum to the isolation problem. No proof of this algorithm is given, since it is so similar to that of Carlson and Nemhauser.

STEP 0: Define a feasible partition P_j, j = 1, ..., r, of the nodes x ∈ N.

STEP 1: Select a free node x ∉ ∪R_j which has not been examined since the last change in the partition.

STEP 2: Let s denote the index of the set to which x is currently assigned (i.e., x ∈ P_s). Calculate

d_j(x) = \sum_{y \in N} c(x,y)\,\pi_j(y),  j = 1, ..., r.

Note that the movement of node x from set s to set t will result in a change in the isolation cost of d_s(x) − d_t(x).

STEP 3: Calculate Δ = max_j {d_j(x)}.

STEP 4: If Δ > d_s(x), then there exists a set P_t such that d_s(x) − d_t(x) < 0 and, hence, a solution with a smaller isolation cost may be found. Move node x into set t and go to Step 1.

If Δ = d_s(x), and a free node has not been examined since the last change in the partition, go to Step 1.

If Δ = d_s(x), and all free nodes have been examined since the last change in the partition, terminate; a local minimum has been found.

COMPUTATIONAL RESULTS

A computer program for the isolation problem has been written in FORTRAN IV for the IBM 360/50 computer. The decomposed problems were solved utilizing the revised simplex method with product form of the inverse, together with the branch and bound algorithm. The subproblems were solved utilizing the Out-of-Kilter algorithm as previously described.
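Steps 0 through 4 above can be sketched compactly. The following is an illustrative reimplementation (not the authors' FORTRAN IV program), run on the Figure 1 example; the costs c(3,5) and c(4,5) are assumed values, invented to be consistent with the optima quoted earlier.

```python
# Carlson-Nemhauser-style local improvement for the isolation problem.
COSTS = {(1, 4): 1, (3, 4): 1, (3, 5): 1, (2, 5): 2, (4, 5): 2}
FIXED = {1: 1, 2: 2, 3: 3}   # distinguished node -> its set index
FREE = [4, 5]

def cost_between(x, y):
    return COSTS.get((x, y)) or COSTS.get((y, x)) or 0

def d(x, j, assign):
    # d_j(x): total cost of edges joining x to nodes currently in set j
    return sum(cost_between(x, y) for y, jy in assign.items() if jy == j and y != x)

def local_search(assign):
    improved = True
    while improved:                      # Steps 1-4, until a local minimum
        improved = False
        for x in FREE:
            s = assign[x]
            scores = {j: d(x, j, assign) for j in (1, 2, 3)}
            t = max(scores, key=scores.get)
            if scores[t] > scores[s]:    # Step 4: an improving move exists
                assign[x] = t
                improved = True
    return assign

def isolation_cost(assign):
    return sum(c for (x, y), c in COSTS.items() if assign[x] != assign[y])

assign = {**FIXED, 4: 1, 5: 2}           # Step 0: an arbitrary feasible start
final = local_search(assign)
```

From this start the search moves node 4 into set 2 and stops at a partition of isolation cost 3, which for this tiny instance coincides with the global optimum (a local search need not do so in general).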
The computational results for a set of representative problems are given in Table II.

Table II. Computational Results for the Isolation Problem

Problem No. | No. of nodes | No. of edges | No. of sets | No. of edges in D_0 | IBM 360/50 execution time (sec.)
1 | 12 | 18 | 3 | 6  | 20.7
2 | 15 | 30 | 3 | 12 | 206.6
3 | 15 | 50 | 3 | 13 | 725.3
4 | 15 | 50 | 4 | 17 | 588.4
5 | 15 | 50 | 5 | 21 | 401.4
6 | 15 | 50 | 6 | 24 | 325.3

The algorithm for generating local minima was also programmed in FORTRAN IV; all problems solved to date have been completed in less than 15 sec.

ACKNOWLEDGMENT

The authors wish to thank The MITRE Corporation, under whose auspices this work was performed, and Professor George Nemhauser of Cornell University for his many helpful suggestions and comments.

REFERENCES

[1] Bennington, G. E., "Multi-Commodity Disconnecting Sets in Undirected Graphs by Node Partitioning," Ph.D. Thesis, The Johns Hopkins University, Baltimore, Maryland (1969).
[2] Carlson, R. C. and G. L. Nemhauser, "Scheduling to Minimize Interaction Cost," Operations Research, 14, 52-58 (1966).
[3] Dantzig, G. B. and P. Wolfe, "The Decomposition Algorithm for Linear Programs," Econometrica, 29, 767-778 (1961).
[4] Fulkerson, D. R., "An Out-of-Kilter Method for Minimal Cost Flow Problems," J. Soc. Ind. and Appl. Math., 9, 18-27 (1961).
[5] Greenberg, Harvey, "Optimal Attack of a Command and Control Network," Ph.D. Thesis, The Johns Hopkins University, Baltimore, Maryland (1968).
[6] Jarvis, John J., "Optimal Attack and Defense of a Command and Control Communications Network," Ph.D. Thesis, The Johns Hopkins University, Baltimore, Maryland (1968).
[7] Land, A. H. and A. G. Doig, "An Automatic Method of Solving Discrete Programming Problems," Econometrica, 28, 497-520 (1960).
[8] Lawler, E. L., "The Quadratic Assignment Problem," Management Science, 9, 586-599 (1963).
[9] Steinberg, L., "The Backboard Wiring Problem: A Placement Algorithm," SIAM Review, 3, 37-50 (1961).
RESOURCE ALLOCATION FOR TRANSPORTATION*

G. Bennington
North Carolina State University†

and

S. Lubore
The MITRE Corporation

ABSTRACT

A linear programming formulation is described that will permit the optimal assignment of transportation resources (vehicles) and movement requirements to a network consisting of a set of designated origins, ports of embarkation, enroute bases, ports of debarkation, and destinations to achieve a prescribed schedule of arrivals.

INTRODUCTION

This paper describes the Resource Allocation For Transportation (RAFT) linear programming model, which allocates transportation resources and movement requirements to a set of routes for the optimal achievement of a prescribed schedule of arrivals.

Several linear programming solutions of ship routing and scheduling problems have been presented. Laderman, Gleiberman, and Egan [3] have investigated two problems of scheduling iron ore shipping on the Great Lakes. Fitzpatrick, Bracken, O'Brien, Wentling, and Whiton [2, 7] have investigated problems of determining the least cost mix of vehicles (air and sea) to satisfy time phased military shipping requirements. Rao and Zionts [5] give an algorithm to determine a set of cyclic ship routings for a nonhomogeneous fleet of ships that will minimize the total shipping costs. Their approach decomposes the linear program into a set of subproblems, each of which is a network flow problem. Dantzig, Blattner, and Rao [1] attack this problem in a different fashion in order to determine a set of vehicle routings that have minimum cost to time ratio. This determination is made by solving a series of shortest path problems in a network that represents all possible cyclic shipping schedules. Schwartz [6] presents a linear integer programming formulation of a problem dealing with the movement of known cargos from origin to destination on time at minimum fleet cost.
Lewis, Rosholdt, and Wilkinson [4] give a comprehensive formulation of a problem dealing with strategic mobility, which they then treat as a single commodity flow problem through a time expanded network.

The RAFT model differs from existing strategic mobility planning LP models in the following respects:

(a) It permits the optimum allocation of both transportation resources and movement requirements to available routes from origin to destination.

(b) It allows for the transshipment of movement requirements subject to constant delays at enroute nodes and transshipment points. (Queues are not allowed at enroute nodes or transshipment points.)

PROBLEM DESCRIPTION

The problem addressed by this paper is one of moving military forces comprised of personnel and materiel from origin to destination over a network consisting of a set of ports (nodes) and vehicle oriented routes in order to achieve a desired schedule of arrivals. For the purposes of this paper, each item to be moved will be called a movement requirement (or requirement). Associated with each movement requirement is a unique origin and destination, together with a date when the requirement becomes ready to be moved from its origin and a preferred delivery date at its destination. Further, each such requirement is restricted to a given quantity of a particular commodity.

The set of transportation resources is composed of the inventory of military aircraft and ships as well as commercially available land, sea, and air vehicles that may be chartered for military use. The transportation resources are partitioned into sets of constrained and unconstrained vehicles. Time-varying vehicle characteristics are considered, as well as the availabilities of vehicles as a function of time.

*The work reported in this document was sponsored by the Defense Communications Agency under contract F19628-68-C-0365.
†This work was performed while the author was at The MITRE Corporation.
Additionally, vehicle characteristics may vary for each of the routes in the network.

For each origin-destination pair in the network, there are defined one or more directed paths consisting of sequences of nodes and arcs. The nodes correspond to the physical points in the network, and the arcs correspond to vehicle movements between points. Each node in the network may be constrained by the maximum vehicle handling throughput capacity for each time period considered in the planning period for the movement. Each of these directed paths is designated as a vehicle oriented route from an origin to a destination. In particular, each route may have any number of enroute stops for vehicles using the route, and any number of transshipment points at which the requirements being moved are transferred from one vehicle type to another.

Figure 1 depicts a typical vehicle oriented route. A requirement shipped over this route travels from port A (origin) by train to airport B (transshipment point). At airport B, it is transferred to an aircraft for shipment by way of airport C (enroute stop) to airport D (transshipment point). At airport D, the requirement is transferred to a truck for eventual delivery to area E (destination). At each enroute node appropriate transit delay times (loading times, unloading times, and stopover times) are assessed in the determination of vehicle use and the times at which the requirement reaches each of the nodes.

[Figure 1. A typical route: Port A to Airport B by train, Airport B to Airport C to Airport D by plane, Airport D to Area E by truck.]

If a particular movement is to be feasible, there must be residual capacity at each node in the network at the times corresponding to vehicle departures from these nodes.* Additionally, sufficient

*Destination nodes are constrained by vehicle arrivals.
No queueing is allowed at intermediate nodes of a route; however, fixed delays may be assessed at each node. MATHEMATICAL FORMULATION NOTATION Movement Requirements The set I represents the set of movement requirements. For each requirement iel, we denote its origin by s, and its destination by t,. Each requirement consists of a nonnegative quantity F, of per- sonnel or materiel to be moved. Time Periods J will denote the set of time periods allowed for a movement. Specific subsets of J will be defined later in the discussion of the constraint equations. Nodes The set P represents the nodes in the network. Each origin, transshipment point, enroute node, and destination is an element of the set P. Transportation Resources The set K is defined as the set of vehicles to be considered for a particular movement. Each vehicle keK is an element of the subset of constrained vehicles K\ C K or the subset of unconstrained vehicles K> C K. It is assumed that the subsets K\ and K> are mutually exclusive and collectively exhaustive. Routes The set N represents the set of all vehicle oriented routes in the network and the nth route is represented by the sequence [pf, (&?),p 2 n , . . .,p», (£«), . . ., (AgJj.pg], where p q 'eP = the qth node in the nth vehicle oriented route and (kq)eK= the arc representing the type of vehicle moving from p£ to p, +1 . A route segment is a portion of a route [p», (*J), . . ., (AjLj.pjJ] that employs the same vehicle, i.e., "■q "-9+1 • • • "-r-l ' where Pq and p"= transshipment points, an origin or a destination. 474 G. BENNINGTON AND S. LUBORE A route, therefore, consists of a series of route segments corresponding to the sequence of vehicles employed on the route. For convenience in defining the constraint equations in the later sections, specific subsets of the set of routes will be defined. 
The set of routes over which requirement i may move is denoted as

N_i^{(1)} ⊂ N = {n ∈ N | s_i = p_1^n, t_i = p_Q^n},

where s_i is the origin for requirement i and t_i its destination.

The set of routes incident with node p is denoted as

N_p^{(2)} ⊂ N = {n ∈ N | p_q^n = p for some q}.

The set of routes over which vehicle k may travel is denoted as

N_k^{(3)} ⊂ N = {n ∈ N | (k_q^n) = k for some q}.

Decision Variables

The set X represents the set of decision variables in the linear program. That portion of the ith movement requirement which leaves its origin in the jth time period traveling over the nth route is denoted as x_{i,j,n}. A decision variable x_{i,j,n} ∈ X is defined for each requirement, for each feasible shipping time period, and for each valid route.

CONSTRAINT EQUATIONS

The set of feasible solutions is defined by the following set of linear inequalities:

Requirement Constraints

(1)  \sum_{X_i^{(1)}} D_{i,i',j,k}\, x_{i',j,n} ≥ F_i,  i ∈ I,

where

X_i^{(1)} = {x_{i',j,n} ∈ X | n ∈ N_{i'}^{(1)}, i' ∈ I},

D_{i,i',j,k} = the amount of requirement i accompanying a unit of requirement i' on vehicle k if shipped in time period j (note that D_{i,i,j,k} = 1), and

F_i = the total amount of requirement i to be moved.

These constraints assure that at least enough of requirement i will be delivered; any overshipment represents residual vehicle capability. If accompanying cargo is not allowed, these constraints are equalities.

Node Constraints

The nodes in the network may be constrained to reflect their vehicle handling capability and/or to reflect the outloading, throughput, or reception capability. If C_{j,p} is the maximum number of vehicle departures (vehicle arrivals for a destination), the constraint equation is:

(2)  \sum_{X_{j,p}^{(2)}} \frac{x_{i,j',n}}{V_{i,j',k,n}} ≤ C_{j,p},  j ∈ J, p ∈ P.

The set X_{j,p}^{(2)} is defined as:

X_{j,p}^{(2)} = {x_{i,j',n} ∈ X | n ∈ N_p^{(2)} ∩ N_i^{(1)}, j' ∈ J_j^{(2)}},

where

J_j^{(2)} = the set of time periods for each of the shipments x_{i,j',n} which leave (arrive at, for a destination) node p in time period j, and
V_{i,j',k,n} = the capacity of the kth type vehicle to carry the ith requirement in time period j' over the nth route.

If C'_{j,p} represents a node outloading, throughput, or reception capacity, the constraint is

\sum A_{i,j,n}\, x_{i,j',n} ≤ C'_{j,p},  j ∈ J, p ∈ P,

where A_{i,j,n} reflects the per unit consumption in the jth time period of node capacity corresponding to shipment x_{i,j',n}.

Vehicle Constraints

(3)  \sum_{X_{j,k}^{(3)}} \frac{x_{i,j',n}}{B_{i,j,k,n}} ≤ M_{j,k},  j ∈ J, k ∈ K_1.

The set X_{j,k}^{(3)} is defined as

X_{j,k}^{(3)} = {x_{i,j',n} ∈ X | n ∈ N_k^{(3)} ∩ N_i^{(1)}, j' ∈ J_j^{(3)}},

where

J_j^{(3)} = the set of time periods for each of the shipments x_{i,j',n} which utilize the kth vehicle type in the jth time period,

B_{i,j,k,n} = the total carrying capability of each vehicle of the kth type when employed during the jth time period on the nth route carrying the ith requirement (in the computation of B_{i,j,k,n}, the average utilization rate of a vehicle can be taken into consideration), and

M_{j,k} = the number of vehicles of type k available for use in time period j.

OBJECTIVE FUNCTION

Since the movement requirements have preferred delivery dates, the objective function must consider both the allocation of scarce resources and the scheduling of arrivals of the requirements at their destinations. Hence, in order to deliver the requirements as close as possible to the preferred delivery dates, the objective function minimizes the total weighted deviations about the preferred delivery dates.

The objective function is:

(4)  minimize \sum_{x \in X} w_{i,j,n}\, x_{i,j,n},

where w_{i,j,n} is a weighting factor.

Weighting Factors

The weighting factors are functions of the deviation of the arrival time period of the ith requirement at its destination, if shipped in the jth time period over the nth route, from the preferred delivery date of the ith requirement. In general, each w_{i,j,n} may be an arbitrary function of the deviations consistent with the subjective utilities of early or late deliveries as well as any routing or vehicle preference.
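Constraints (1) and (3) can be illustrated with a toy feasibility check. The sketch below is not part of the RAFT implementation; it assumes the simplest setting (no accompanying cargo, so D is the identity, and a single vehicle type per route), and all names and numbers in it are invented for illustration.

```python
# Invented illustrative data
F = {"req1": 100, "req2": 60}               # tons to move per requirement
K_OF_ROUTE = {"air": "C5", "sea": "ship"}   # assumed vehicle type on each route
B = {"C5": 50, "ship": 200}                 # tons one vehicle moves per period
M = {"C5": 2, "ship": 1}                    # vehicles available each period

def feasible(x):
    """x maps (requirement, period, route) -> tons shipped."""
    shipped = {}
    used = {}                               # (period, vehicle type) -> vehicles needed
    for (i, j, n), q in x.items():
        shipped[i] = shipped.get(i, 0) + q
        k = K_OF_ROUTE[n]
        used[(j, k)] = used.get((j, k), 0) + q / B[k]
    ok_req = all(shipped.get(i, 0) >= F[i] for i in F)        # constraint (1)
    ok_veh = all(v <= M[k] for (j, k), v in used.items())     # constraint (3)
    return ok_req and ok_veh

plan = {("req1", 1, "air"): 100, ("req2", 1, "sea"): 60}
```

The plan above is feasible: 100 tons by air in period 1 needs 100/50 = 2 of the 2 available aircraft, and 60 tons by sea uses a fraction of one ship. A plan that omits a requirement, or that implies more vehicles than M allows, fails the check.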
Figure 2 illustrates possible utility functions to perform the scheduling. Figure 2(A) depicts a situation where early arrivals are always preferred over late arrivals, and the weights vary linearly with respect to the deviations. Figures 2(B) and 2(C) correspond to equal trade-offs between early and late deviations; Figure 2(B) minimizes the absolute value of the deviations, while Figure 2(C) minimizes the square of the deviations. Other weighting factors are easily envisioned, and the particular choice should be made consistent with the specific movement to be analyzed.

[Figure 2. Possible weighting functions to minimize deviations about the preferred date.]

In the analysis used to date, the weighting function defined in Figure 2(A) has been used. Point B is the preferred delivery date for some requirement and is given a weighting factor value of one. Any deviation from point B has a weighting factor greater than one. The size of the weighting factor is consonant with the particular region of the curve within which an arrival occurs.

Regions of the Weighting Factor Curve

Early Infeasible Region. This region covers the periods of time during which the requirement either is not ready to be moved or cannot possibly arrive. In a particular movement, arrivals that are more than a fixed number of time periods prior to the preferred delivery date could be inadmissible. In this region, a weight would not be assigned; rather, the decision variable would be suppressed.

Early Region. This region is represented by the points from A to B, corresponding to the feasible early arrival periods. An arrival in this region is always preferred to an arrival in the late region. The weighting factor is assigned a value of one at point B and is increased in steps of unity for each preceding time period.
For example, if point A corresponds to an arrival two time periods prior to the preferred delivery, the weighting factor assigned to shipments arriving in the time period corresponding to point A has a value of three.

Late Region. This region, represented by the points from C to D, corresponds to a late arrival. Point C is assigned a weight that is one greater than any other requirement's maximum early arrival weight. The weights assigned to points beyond point C increase in value in unit increments. Hence, if the maximum possible early weight for any requirement is 40, the first late weight would be 41. In this manner, trade-offs between early and late arrivals are always biased towards the early end of the scale for the requirements to be moved in a particular problem.

Late Infeasible Region. This region of the curve corresponds to deliveries that would arrive beyond the planning period for the total movement. This region may also correspond to deliveries that are inadmissible, since they arrive more than some fixed number of time periods after the preferred delivery period. In this region, the decision variables would not be assigned weights; instead, they would be suppressed.

MODEL OPTIONS

Alternative Objective Functions

Other alternative objective functions may be utilized, depending upon the particular type of problem to be analyzed.

Minimum Time Solution. In some strategic movements, it is desirable to accomplish the overall movement as quickly as possible, subject to the constraints described above. Figure 3 depicts a weighting function that will minimize the weighted delivery periods of all requirements. If a trade-off exists in the same arrival period between one requirement arriving early and another late, preference is given to the requirement that would arrive after its preferred delivery period.

[Figure 3. Weighting function for minimum time.]
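The Figure 2(A) scheme described above (weight one at the preferred date, unit steps for each earlier period, and late weights offset past every requirement's maximum early weight) can be written as a small function. This is an illustrative sketch, not code from the paper; the parameter names are invented.

```python
def weight(deviation, max_early_weight):
    """Weighting factor per the Figure 2(A) scheme described in the text.

    deviation: arrival period minus preferred period (negative = early).
    max_early_weight: the largest early weight over all requirements,
    used to offset the late region so early arrivals are always preferred.
    """
    if deviation <= 0:
        # weight 1 at the preferred date, +1 for each period earlier
        return 1 - deviation
    # the first late weight is one more than the maximum early weight
    return max_early_weight + deviation
```

With the text's numbers: an arrival two periods early receives weight 3, and if the maximum possible early weight is 40, the first late weight is 41.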
The discontinuity between points B and C is used to resolve such a trade-off.

Minimum Cost. Rather than minimizing the deviations from a set of preferred delivery periods, it may be more important to minimize the total cost of the movement. In this case, the objective function becomes

(5)  minimize Σ_{i,j,n} c_{i,j,n} x_{i,j,n},

where c_{i,j,n} = the cost of shipping a unit of the ith requirement in the jth time period over the nth route.

Maximum Flow. If the object is to maximize the total flow of commodities over the network, subject to the vehicle and node capacity constraints, the objective function becomes

(6)  maximize Σ_{i,j,n} x_{i,j,n}.

Secondary Optimization

In some instances, it is desirable not only to attain a minimum deviation or minimum time solution, but also to achieve this objective at the minimum cost. This can be accomplished by first solving the minimum time problem and then annexing the objective function to the constraint matrix to maintain an optimal solution to the original problem. The linear cost function is then minimized.* This results in the least cost solution that has the minimum total deviation or minimum time.

Trade-Off Bias Among Commodities

The weighting functions considered in the case of minimum deviation or minimum time solutions have so far treated a unit of each commodity as being of equal value. Since in many cases a single passenger is not of equal value to a ton of cargo, it is desirable to give preference to a unit of the more valuable commodity if competition for scarce resources exists. One way to eliminate this trade-off bias is to multiply each weighting function value by an additional factor. Hence, if each passenger delivered on time is considered to be as valuable as 400 pounds of cargo, each weighting factor for the passenger requirement would be reduced to one-fifth of its nominal value. In this manner, every five passengers delivered are equivalent to a single ton of cargo delivered. Similar factors may be introduced for other commodities.
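As a concrete illustration, the Figure 2(A) weighting scheme described above can be sketched in a few lines of code. The function name and its interface are our own illustrative choices, not part of the authors' FORTRAN programs; only the weighting rules themselves come from the text.

```python
# Sketch of the Figure 2(A) weighting scheme: weight 1 at the preferred
# period, +1 per period of earliness, late weights starting one above the
# largest early weight in the whole problem, infeasible periods suppressed.

def build_weights(preferred, earliest, latest, horizon, max_early_weight):
    """Return {period: weight} for one requirement; suppressed periods omitted.

    preferred: preferred delivery period (point B)
    earliest:  first feasible early period (point A)
    latest:    last feasible late period (point D)
    horizon:   number of time periods, numbered 0 .. horizon-1
    max_early_weight: largest early-arrival weight over ALL requirements,
                      so every late weight is biased past every early one
    """
    weights = {}
    for j in range(horizon):
        if j < earliest or j > latest:
            continue                                   # infeasible: variable suppressed
        if j <= preferred:
            weights[j] = 1 + (preferred - j)           # 1 at B, +1 per earlier period
        else:
            weights[j] = max_early_weight + (j - preferred)  # first late weight = max + 1
    return weights

# A requirement preferred in period 5, feasible from period 3 through 8,
# in a problem whose largest early weight is 40:
w = build_weights(preferred=5, earliest=3, latest=8, horizon=12, max_early_weight=40)
```

With these data the example in the text is reproduced: an arrival two periods early gets weight 3, and the first late weight is 41.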
Priorities Among Movement Requirements

Each requirement can be given a priority to favor its shipment over other shipments in a given time period if enough resource capacity is not available to ship all competing requirements simultaneously. In general, each requirement will be delivered eventually, so it is only the increments of weights between arrival times that are important, and not their absolute values. Consider the case of two requirements with the weighting functions shown in Figure 4. The higher priority requirement corresponds to the line segments A′B′ and C′D′. Because the slopes of A′B′ and C′D′ are twice those of AB and CD, respectively, the incremental penalty for deviating from the preferred delivery period is greater for the higher priority requirement, and hence the higher priority requirement will always be given preference over the lower priority requirement.

*There are several alternative methods of implementing a secondary optimization, but this method has yielded good results for the authors.

FIGURE 4. Weighting function with two priority classes

In addition, the distances from B to C and from B′ to C′ tend to force the late delivery of the lower priority requirement before that of the higher priority requirement. In the general case of many priority classes, the slopes are increased in proportion to the priorities, and the step from B to a delivery one period late is greater than the largest possible weight for the next lower priority. In this manner, a number of different priorities may be handled.

Movement Requirement Mode Preference

In some movements, a shipping mode preference (air or sea) is given for each movement requirement. Three options are available in the model with respect to modal preference.
(a) Fixed Mode — The requirement is forced to travel by the preferred mode by suppressing all of the decision variables for which the preferred mode of the shipment does not match that of any vehicle on the route.

(b) Preferred Mode — The objective function weights are decreased by a small quantity, say δ, if the corresponding x_{i,j,n} results in a shipment over a route that uses vehicles consistent with the preferred mode of shipment.

(c) Mode Indifference — The mode preference, if any, is ignored.

Vehicle Preference by Cargo Type

In some movements certain vehicles may be preferred for the shipment of one cargo type over another. The model specifies an ordered preference of vehicles for each cargo type. In a manner similar to the treatment of a movement requirement mode preference, the objective function is biased to reflect preferences for vehicle usage.

Table I represents a typical vehicle-cargo preference relationship. In the table, the lower numerical values represent the higher priorities. Vehicle 1 is preferred over all other vehicles for the shipment of cargo type 1, and vehicle 3 is preferred for cargo type 4. The relative preference for utilization of vehicle 2 is the shipment of cargo type 2 first, followed by cargo types 3, 1, and 4.

TABLE I. Vehicle-Cargo Preference

                          Cargo Type
  Vehicle Type      1      2      3      4
       1            1      6     10     10
       2            5      1      2      6
       3           10      7      2      1

Increased Aircraft Utilization

The productivity of an aircraft type is reflected in the value of B_{i,j,k,n} in the vehicle constraint equations (Eqs. (3)). The value of B_{i,j,k,n} is based upon the long-term average utilization rate of an aircraft over a given route. However, for short periods of time this average utilization rate might be exceeded, as long as the long-term average value is maintained. For these periods of increased utilization, a higher aircraft productivity B̂_{i,j,k,n} may be used. The constraint equations, Eqs.
(3), can be modified for aircraft to allow for increased utilization over some time periods to compensate for less than average utilization in preceding time periods. This is accomplished by defining a variable T_{j,k} corresponding to this underutilization. This variable may then be carried forward into subsequent time periods, where it serves to augment the available vehicle productivity. The modified Eqs. (3) become

(7)  Σ_{i,n} x_{i,j,n}/B_{i,j,k,n} + T_{j,k} − T_{j−1,k} ≤ M_{j,k},   j ∈ J, k ∈ K₁,

with the understanding that T_{0,k} = 0. Additionally, it is necessary to add a second set of vehicle constraint equations to ensure that the amount of slack capacity does not exceed the total possible maximum vehicle productivity in a given time period:

(8)  Σ_{i,n} x_{i,j,n}/B̂_{i,j,k,n} ≤ M_{j,k},   j ∈ J, k ∈ K₁.

If it is also desired to restrict greater-than-average vehicle utilization to a single time period, the following constraint equations are used instead of Eqs. (8):

(9)  Ω(M_{j,k} − Σ_{i,n} x_{i,j,n}/B̂_{i,j,k,n}) − T_{j,k} ≥ 0,   j ∈ J, k ∈ K₁,

where Ω = a positive constant. Equation (9) will cause the slack variable T_{j,k} to vanish (due to the nonnegativity restriction on linear programming variables) should the maximum productivity be reached. The term in parentheses acts like Eq. (8), and the value of the constant Ω determines the maximum value of slack that may be carried forward. If vehicles are allowed to operate at their maximum utilization rate for two or more successive periods (if sufficient unused slack capacity is available), the variable T_{j,k} in Eq. (9) may be replaced by T_{j+m−1,k}, where m is the number of successive periods allowed.

ALTERNATIVE FORMULATIONS

Alternative formulations and modifications were investigated during the formulation of the linear programming model described above. Two of these warrant discussion, since they highlight future areas of research or give a better understanding of why some models have not been developed in spite of their theoretical tractability.
Route (Arc-Chain) Formulation

The previous formulation may be described as a route (arc-chain) formulation. The variable x_{i,j,n} corresponds to the flow over the nth route, and the constraints upon the vehicles employed on the route are generated by entering the variable x_{i,j,n} into each of the appropriate vehicle constraint equations, using the implicit assumption that no storage is allowed at any of the transshipment nodes and that the next vehicle is available when required.

Node-Arc Formulation

A node-arc formulation defines a new variable x_{i,j,q} corresponding to a shipment on the qth vehicle-oriented arc carrying requirement i in the jth time period. Using this formulation, it is possible to consider in-transit storage of requirements, as long as enough of each requirement i is delivered at the initial node of arc q before it can be transshipped over the arc. This requires additional constraint equations at each transshipment node m of the form

(10)  Σ_{q∈B(m)} x_{i,j,q} ≥ Σ_{q∈A(m)} x_{i,j,q},   i ∈ I, j ∈ J,

where B(m) = the set of shipments of requirement i into node m that are available for transshipment in or prior to time period j, and A(m) = the set of shipments of requirement i out of node m in or prior to time period j. This is just an inequality form of the conservation of flow equations, with the excess of input over output taken as storage at the mth node. In addition, it is possible to place an upper bound upon the quantity of goods being stored.

The node-arc formulation has two attractive features: (1) inventories at nodes are allowed;* and (2) the number of variables only increases linearly with the number of route segments.

*A limited capability to incorporate in-transit storage within the arc-chain formulation may be accomplished by defining additional variables corresponding to the storage of a requirement at a transshipment node for a fixed number of time periods.

The distressing
feature of this formulation is the increase in the number of constraint equations required. An additional constraint equation is required for each transshipment node, for each requirement, in each time period. The linear program must manipulate a larger basis and may require additional auxiliary storage, which sharply increases the computation times. For the above reasons the route formulation was chosen for implementation. The additional capability of a node-arc formulation appears to be outweighed by the increased computational burden. Although the number of possible routes in a graph can literally explode as the complexity of the network increases, this does not seem to be a practical limitation, since many sets of routes can be excluded a priori, and the computational effort resulting from the additional variables does not seem to be prohibitive.

REPOSITIONING OF VEHICLES

The previous formulations have explicitly made assumptions about the distribution of vehicles to shipping nodes in the generation of the vehicle constraints. A common assumption is that a vehicle "cycles" back to the point from which it left. Under this assumption a vehicle used for a shipment is constrained from further use until it returns to the shipping node. Upon its return, it is made available for immediate reuse at any other shipping node in the network. For some problems this may be a serious error. If a ship returning from London to New York is immediately reused at San Francisco without a transit time delay to reposition to the West Coast, an unrealistic situation exists.

Vehicle repositioning may be taken into consideration by defining repositioning nodes m for each vehicle and defining a new decision variable y_{j,k,m,m′} corresponding to the number of empty vehicles of the kth type repositioned from node m to node m′ in time period j. It is necessary to add additional vehicle repositioning constraint equations to maintain a flow balance at each of the nodes m and m′.
CASE 1: m is an unloading point for the kth vehicle type:

(11)  Σ_{j′∈B(m)} x_{i,j′,n}/B_{i,j′,k,n} ≥ Σ_{j′∈A(m)} y_{j′,k,m,m′},   k ∈ K₁, j ∈ J,

where B(m) = the set of shipments using the kth vehicle type that are available for repositioning from node m in or prior to time period j, and A(m) = the set of vehicles of type k repositioned from node m in or prior to time period j.

CASE 2: m′ is a loading point for the kth vehicle type:

(12)  Σ_{j′∈A(m′)} y_{j′,k,m,m′} ≥ Σ_{j′∈B(m′)} x_{i,j′,n}/B_{i,j′,k,n},   k ∈ K₁, j ∈ J,

where A(m′) = the set of vehicles of type k repositioned to node m′ for reuse in or prior to time period j, and B(m′) = the set of shipments out of node m′ using vehicle type k in or prior to the jth time period.

In either case, the slack variable in the constraint equation corresponds to the inventory of vehicles at node m or node m′. Although these additional constraints seem to enhance the model, it must be remembered that the model is a linear program with continuous variables. Hence, the number of vehicles repositioned is not necessarily of integral value, and the added capability for repositioning may not lead to a more realistic solution.

COMPUTER MODEL

Computer programs have been written to generate the matrix elements corresponding to Eqs. (1) through (9), with the exception of the maximum flow option, and to generate reports from the linear programming results. These programs are written in FORTRAN IV to be used in conjunction with the Control Data ALEGRO Linear Programming System on a 3600 computer, or with the IBM Mathematical Programming System (MPS) on a 360/50 or 360/65 computer. The authors have found that by inserting "artificial" variables with high objective function weights into the appropriate constraint equations, the linear program running time is significantly reduced from the running time resulting from the use of the two-phase method of the revised simplex algorithm.
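The "artificial variables with high objective function weights" device just mentioned (the big-M method) can be illustrated on a toy problem. Here SciPy's LP solver merely stands in for the ALEGRO and MPS codes of the paper, and all of the data are invented for the example: an artificial variable a absorbs any infeasibility in the equality row, and its large cost drives it to zero at the optimum without a separate phase-one solve.

```python
# Toy big-M illustration (made-up data): min x1 + 3*x2 subject to
# x1 + x2 = 4 and x1 <= 3, with an artificial variable a in the equality row.
from scipy.optimize import linprog

M = 1e6                        # large penalty on the artificial variable
c = [1.0, 3.0, M]              # costs for x1, x2 and the artificial a
A_eq = [[1.0, 1.0, 1.0]]       # x1 + x2 + a = 4  (a absorbs infeasibility)
b_eq = [4.0]
A_ub = [[1.0, 0.0, 0.0]]       # x1 <= 3
b_ub = [3.0]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, None)] * 3)
x1, x2, a = res.x              # the penalty forces a to 0 at the optimum
```

Because M dwarfs the legitimate costs, any basis that keeps a positive is dominated, so the artificial variable leaves the basis and the original problem's optimum (x1 = 3, x2 = 1 here) is recovered in a single solve.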
Only limited computational results have been obtained to date; however, Table II summarizes the computational experience obtained while testing the model formulation on a set of sample problems. All of the samples were run on an IBM 360/50 computer. The execution times do not include the time for matrix generation or the reformatting of the linear programming solution.

TABLE II. Computational Results

            --------- LP Parameters ----------   ---- Problem description ----
 Problem                  Density         Exec.   Time    Require-        Vehicle
 number   Rows  Columns     (%)   Iter.  (min)   periods   ments   Routes  types
    1       55     252     6.55     41    0.49     12        6        4      4
    2       55     252     6.55     17    0.79     12        6        4      4
    3       73     199     5.81     28    0.50     12       12        8      4
    4       85     244     5.61     23    0.45     12       12       28      4
    5      133     293     3.48     46    0.70     12       12       28      8
    6      263    2924     1.32    136    4.91     17       74      108     13
    7      390     862     1.54    107    2.67     25      110       35     12
    8      390    1422     2.34    112    4.29     25      110       77     12

Problems 1 and 2 consider a network of only two nodes and four parallel routes between these nodes. This network corresponds to a movement of requirements from a single POE to a single POD utilizing four distinct types of vehicles. The rows of the constraint matrix are composed of 48 vehicle constraint equations, one per vehicle type per time period, together with six requirement constraints. The remaining row is the objective function. Problem 1 minimizes the overall closure time of the movement, and Problem 2 minimizes the weighted deviations of the movement requirements about their preferred delivery dates. Although Problem 2 requires fewer iterations to achieve an optimal solution, it requires more computational time. Behavior of this type is not atypical of the computational experience obtained with most commercial LP codes when applied to small problems, and is most likely due to the heuristic multiple pricing rules or reinversion policy employed in the iterating agenda.
Problem 3 is an extension of the first two problems and includes a second POE-POD pair, an enroute base for aircraft, and an increased number of movement requirements. In attaining the minimum deviation solution, the number of aircraft sorties through the enroute base was constrained; however, transshipment was not allowed.

Problems 4 and 5 allow transshipment and include two origins, two POE's, one enroute base, one POD, and one destination. The problems differ only in the number of constrained vehicle types. One of the POE's and the enroute base are constrained. Problem 4 did not consider the origin-to-POE or POD-to-destination vehicles to be limited in number. Problem 5 placed an upper bound upon the number of these short-haul surface vehicles available.

Problems 6 through 8 consider larger problems that are similar in structure to Problem 5. As before, the number of iterations required and the computational times do not appear to obey any fixed relationship with respect to the problem size in terms of the parameters of the linear programs.
INTERVAL ESTIMATION OF THE NORMAL MEAN SUBJECT TO RESTRICTIONS, WHEN THE VARIANCE IS KNOWN*

Saul Blumenthal
New York University
Bronx, New York

1. INTRODUCTION AND SUMMARY

We consider here the problem of obtaining confidence interval estimates of the mean of a normal distribution when the mean is restricted to be nonnegative and the variance is known. The problem of point estimation for truncated parameters has been considered by Katz [13] and Farrell [9]. For the interval estimation problem, we shall work with the loss function

(1.1)  L(θ, I) = c(δ₂(·) − δ₁(·)) + [0 if θ ∈ I(·); 1 if θ ∉ I(·)],

where the interval estimate I(·) = [δ₁(·), δ₂(·)] is closed with upper and lower endpoints δ₂ and δ₁, respectively, and θ is the mean being estimated. The loss function (1.1) was used by Joshi [12] in demonstrating admissibility of the standard interval estimate for the normal mean in the unrestricted case. Dudewicz [8] considers this loss function in constructing interval estimates for ranked means. For the unrestricted mean problem, Ferguson [11, p. 184] mentions this loss function, and variations on it have been used by Aitchison and Dunsmore [1], where the expected size of the error rather than the error probability appears in the risk function, and by Konijn [14 and 15], where the risk function used the noncoverage probability and the probability of covering incorrect values.
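In numerical form the loss (1.1) is simply a length charge plus a unit miss penalty; the following sketch, in our own notation, makes the trade-off explicit (the function name and argument order are illustrative assumptions, not the paper's):

```python
# Loss (1.1): c times the interval length, plus 1 if the interval [lo, hi]
# fails to cover the true mean theta.
def loss(theta, lo, hi, c):
    length_penalty = c * (hi - lo)
    miss_penalty = 0.0 if lo <= theta <= hi else 1.0
    return length_penalty + miss_penalty

covered = loss(0.5, 0.0, 1.0, c=0.1)   # pays only for length
missed = loss(2.0, 0.0, 1.0, c=0.1)    # pays for length plus the unit penalty
```

The risk function (expected loss) therefore balances expected length against noncoverage probability, which is exactly the trade-off studied in the sections that follow.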
The problem of constructing confidence intervals for restricted parameters has been considered in some detail for the logistic distribution by Bartholomew [3] and has been discussed in general by Stein [18] and Bartholomew [2]. In Section 2, we obtain Bayes estimates with respect to folded normal prior distributions and demonstrate the admissibility of the generalized Bayes interval obtained by taking a uniform prior distribution on (0, ∞). The admissibility is subject to the same equivalence class restrictions as in Joshi [12], and the reader is referred to that paper for a full discussion of admissibility for interval estimators. The risk functions of the generalized Bayes estimator and a natural truncated estimator are discussed in Section 3, along with the question of minimaxity. The truncated estimator is shown to be minimax. In Section 4, we consider sequential stopping rules where sampling cost is linear, and we find rules which are asymptotically pointwise optimal and asymptotically optimal Bayes rules in the sense discussed by Bickel and Yahav [6]. In Section 5, we replace c by h(θ) in (1.1) and examine how the length of the resulting Bayes intervals is related to h(θ).

*Research supported by National Science Foundation Grants GP-7024 and GP-7399, School of Engineering and Science, New York University.

2. BAYES INTERVALS AND ADMISSIBILITY

The observations X₁, . . ., Xₙ are independent with common normal distribution N(θ, ν²), with known variance ν². The average X̄ is then N(θ, ν²/n). To avoid continually writing ν²/n, we define

(2.1)  σ² = ν²/n,  Y = X̄.

Corresponding to the loss function (1.1), we have the risk function

(2.2)  R(θ, I) = c E_θ(δ₂(Y) − δ₁(Y)) + 1 − P_θ(δ₁(Y) ≤ θ ≤ δ₂(Y)).

To generate Bayes estimates, we let θ have a normal distribution with mean 0 and variance τ², confined to θ ≥ 0, i.e.,

(2.3)  g_τ(θ) = (2/πτ²)^{1/2} e^{−θ²/2τ²},  θ ≥ 0;
       g_τ(θ) = 0,  θ < 0.
The posterior density of θ given Y is

(2.4)  g_τ(θ|y) = {Φ(y/σγ)}⁻¹ {γ²/2πσ²}^{1/2} e^{−(γ²/2σ²)(θ − y/γ²)²},  θ ≥ 0;
       g_τ(θ|y) = 0,  θ < 0,

where

(2.5)  γ² = 1 + (σ²/τ²).

For the purpose of finding the Bayes estimation interval, we characterize interval estimators by a function ξ(θ, y), where

(2.6)  ξ(θ, y) = 1 if δ₁(y) ≤ θ ≤ δ₂(y);
       ξ(θ, y) = 0 otherwise.

LEMMA 2.1: The Bayes estimation interval for prior densities (2.3) and loss function (1.1) is given by

(2.7)  δ₁(Y) = max [0, M(γ, Y) − D(γ, Y)],
       δ₂(Y) = max [0, M(γ, Y) + D(γ, Y)],

where

(2.8)  M(γ, Y) = Y/γ²

and

(2.9)  D(γ, Y) = {(2σ²/γ²) log [(γ²/2πσ²)^{1/2} c⁻¹ [Φ(Y/σγ)]⁻¹]}^{1/2},

provided that

(2.10)  c < (γ²/2πσ²)^{1/2} [Φ(Y/σγ)]⁻¹.

If (2.10) is violated, the interval shrinks to a point, which can be chosen arbitrarily as any nonnegative number.

PROOF: Using (2.6), we can write

(2.11)  r(τ, I) = 1 + ∫₀^∞ g_τ(θ) dθ ∫_{−∞}^∞ [c(δ₂(y) − δ₁(y)) − ξ(θ, y)] f(y|θ) dy
               = 1 + ∫_{−∞}^∞ h_τ(y) dy ∫₀^∞ [c − g_τ(θ|y)] ξ(θ, y) dθ,

where Fubini's Theorem justifies changing the order of integration (the conditions for using the theorem are verified as in Joshi [12, Lemma 5.1]), and where

(2.12)  h_τ(y) = (2/πγ²τ²)^{1/2} Φ(y/σγ) e^{−y²/2γ²τ²}.

The expression in (2.11) will be minimized by minimizing the inner integral on the second line for each y, namely by taking

(2.13)  ξ_τ(θ, y) = 1 if g_τ(θ|y) ≥ c;
        ξ_τ(θ, y) = 0 otherwise.

Since g_τ(θ|y) is unimodal and monotone on either side of the mode, it is seen that (2.13) and (2.7) are equivalent. Condition (2.10) is needed to guarantee that g_τ(θ|y) ≥ c for some θ. This completes the proof.

NOTE: Condition (2.10) assures that the cost due to length does not dominate the cost due to noncoverage. Since any point estimate has zero length and zero coverage probability, the degeneracy in the location arises. Because condition (2.10) depends on Y, it is possible for small Y values to lead to nondegenerate intervals, while larger Y values lead to the degenerate situation.
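Numerically, the interval of Lemma 2.1, including the degenerate case just discussed, can be sketched as follows. This is our own coding of (2.5) and (2.7)-(2.10) as reconstructed above; since the printed constants are OCR-damaged, treat the exact form of the logarithm's argument as an assumption.

```python
# Sketch of the Bayes interval of Lemma 2.1 for a folded normal prior.
from math import erf, log, pi, sqrt

def Phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bayes_interval(y, sigma, tau, c):
    gamma2 = 1.0 + sigma**2 / tau**2                    # (2.5)
    gamma = sqrt(gamma2)
    M = y / gamma2                                      # posterior mode, (2.8)
    arg = sqrt(gamma2 / (2.0 * pi * sigma**2)) / (c * Phi(y / (sigma * gamma)))
    if arg <= 1.0:                                      # condition (2.10) violated:
        return None                                     # interval degenerates to a point
    D = sqrt((2.0 * sigma**2 / gamma2) * log(arg))      # half-length, (2.9)
    return max(0.0, M - D), max(0.0, M + D)             # (2.7)

lo, hi = bayes_interval(y=1.0, sigma=1.0, tau=5.0, c=0.1)
```

As τ grows the interval approaches the uniform generalized Bayes interval of Corollary 2.2 below, and for c large enough to violate (2.10) the function returns the degenerate case, mirroring the NOTE above.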
To avoid this, it seems desirable to add a stronger condition assuring nondegenerate intervals for all observations. Noting that

(1/2πσ²)^{1/2} ≤ (γ²/2πσ²)^{1/2} ≤ (γ²/2πσ²)^{1/2} [Φ(y/σγ)]⁻¹,

we can eliminate the dependence on Y by using (γ²/2πσ²)^{1/2} as a bound for c. This still leaves the bound dependent on τ (through γ), and when we take limits as τ → ∞, we will want c such that a nondegenerate solution holds for all τ. This suggests imposing the condition

(2.14)  c < (1/2πσ²)^{1/2},

which we shall do henceforth. Even under (2.10), it is possible to have a degenerate interval at zero because M + D may be negative. The requirement that M + D > 0 for all Y reduces (for Y < 0) to

(2.15)  (cσ√(2π)/γ) Φ(y/σγ) < e^{−y²/2γ²σ²}.

Since it is always true that Φ(x) < e^{−x²/2} (x ≤ 0), it is seen that the stronger condition (2.14) assures M + D > 0, so that no point estimates will be incurred. Condition (2.10) alone does not insure (2.15). We summarize the above remarks in Corollary 2.1.

COROLLARY 2.1: The Bayes estimation intervals I^τ for prior densities (2.3), loss (1.1), and condition (2.14) are nondegenerate for all Y and τ, with δ₁(Y) given by (2.7), and

(2.16)  δ₂(Y) = M(γ, Y) + D(γ, Y) > 0.

We shall now study the generalized Bayes rule obtained by taking a uniform distribution on (0, ∞). It is obtainable from the previous results as indicated below. We denote it by I^p.

COROLLARY 2.2: Under the conditions of Corollary 2.1, the uniform generalized Bayes interval estimate is obtained by letting τ → ∞, and is given by

(2.17)  δ₁(Y) = max [0, Y − D(Y)],
        δ₂(Y) = Y + D(Y),

where

(2.18)  D(Y) = {−2σ² log [cσ√(2π) Φ(Y/σ)]}^{1/2}.

Next, we study the risk function of the Bayes estimator.

LEMMA 2.2: The risk function of I^τ is given for θ ≥ 0 by

(2.19)  R(θ, I^τ) = c{γ⁻²[θ Φ(γu(γ) − (θ/σ)) − (σ/√(2π)) e^{−(1/2)(γu(γ) − (θ/σ))²}]
            + 2√2 (σ/γ) ∫_{−∞}^{∞} [−log c√(2π)(σ/γ) Φ(γ⁻¹(y + (θ/σ)))]^{1/2} φ(y) dy
            − √2 (σ/γ) ∫_{−∞}^{γu(γ)−(θ/σ)} [−log c√(2π)(σ/γ) Φ(γ⁻¹(y + (θ/σ)))]^{1/2} φ(y) dy}
            + 1 − {Φ(γu₂(γ, θ) − (θ/σ)) − Φ(γu₁(γ, θ) − (θ/σ))},

where u₁(γ, θ) and u₂(γ, θ) are the two solutions of

(2.20)  Φ(x) = (γ/cσ) φ(x − (γθ/σ)),

and u(γ) = u₂(γ, 0) is the unique positive solution of (2.20) for θ = 0.

PROOF: Since D(γ, Y) > 0 a.s., it is immediately obtained that (Y/γ²) ≥ D(γ, Y) (i.e., δ₁(Y) > 0 in (2.7)) if and only if Φ(Y/γσ) ≥ (γ/cσ) φ(Y/γσ), Y ≥ 0. Furthermore, Φ(u) is a strictly increasing function of u, while φ(u) is strictly decreasing on [0, ∞). Hence, (Y/γ²) ≥ D(γ, Y) ⟺ (Y/γσ) ≥ u(γ). Therefore,

E_θ{δ₂(Y) − δ₁(Y)} = 2 E_θ{D(γ, Y) I_{[σγu(γ), ∞)}(Y)} + E_θ{(M(γ, Y) + D(γ, Y)) I_{(−∞, σγu(γ))}(Y)}.

It may be immediately verified that this yields the terms in (2.19) which are multiplied by c. Write the noncoverage probability as P_θ[δ₂(Y) < θ] + P_θ[δ₁(Y) > θ ≥ 0]. The first of these can be written as

P{M + D < θ} = P{D² < (θ − M)², θ > M} = P{Φ(Y/σγ) > (γ/cσ) φ((Y/σγ) − (γθ/σ)), Y < γ²θ},

and similarly for the other probability. Combining these gives as the coverage probability

(2.21)  P_θ[σγ u₁(γ, θ) ≤ Y ≤ σγ u₂(γ, θ)].

Note that Eq. (2.20) does have two solutions when θ > 0, since (2.14) assures [φ(x − (γθ/σ))/Φ(x)] > (cσ/γ) at x = (γθ/σ), and

lim_{x→−∞} [φ(x − (γθ/σ))/Φ(x)] = lim_{x→+∞} [φ(x − (γθ/σ))/Φ(x)] = 0.

The probability statement in (2.19) follows from (2.21) and straightforward manipulation.

NOTE: The risk function of I^p is obtained by setting γ = 1 in (2.19).

We shall now examine the difference between the Bayes risks of I^τ and I^p. Let

(2.22)  r(τ, I) = ∫₀^∞ R(θ, I) g_τ(θ) dθ

denote the Bayes risk of an estimation interval I. In order to bound this difference, we must first look at the functions u_i(γ, θ), i = 1, 2, defined by (2.20).

LEMMA 2.3: The solutions u_i(γ, θ) of (2.20) satisfy, for all θ,

(2.23)  |u_i(γ, θ) − (γθ/σ)| ≥ [2 log (γ/cσ√(2π))]^{1/2},  i = 1, 2.
Also, |u_i(γ, θ) − (γθ/σ)| is monotone decreasing as θ increases, with a limit of [2 log (γ/cσ√(2π))]^{1/2}. The functions u_i(γ, θ) are monotone increasing in θ (i = 1, 2). For all θ sufficiently large (θ > θ*, say),

(2.24)  |u_i(γ, θ) − (γθ/σ)| < [2 log (γ/cσ√(2π))]^{1/2} + [(2cσ/γ) e^{−(1/2)(γθ/σ)²}]^{1/2}.

Finally, the functions u(γ) = u₂(γ, 0) satisfy, for all τ ≥ 1,

(2.25)  0 < u(γ) − u(1) < (σ/2c)(1 + (σ²/2))(1/τ²).

The proof of this and the following lemma will be given in the Appendix.

LEMMA 2.4: The difference in Bayes risks between I^p and I^τ is O(τ⁻²), i.e., for some B,

(2.26)  τ² [r(τ, I^p) − r(τ, I^τ)] < B, for all τ.

Not only is the left side of (2.26) bounded, but it also has a limit as τ → ∞, which we state below.

COROLLARY 2.3: The limit as τ → ∞ of the left side of (2.26) exists and is given by

(2.27)  lim_{τ→∞} τ² [r(τ, I^p) − r(τ, I^τ)] = (cσ/4)(1 + √2) [−log cσ√(2π)]^{1/2} + (1/4σ) [−log cσ√(2π)]^{−1/2}.

The proof of this Corollary is given in the Appendix. Next, we establish our main result.

THEOREM 2.1: Let I^p be the interval estimator (2.17), and I′ any other estimator such that

(2.28)  R(θ, I′) ≤ R(θ, I^p) for all θ ≥ 0.

Then I′ is equivalent to I^p, i.e., ξ′(x, θ) = ξ^p(x, θ) for almost all (x, θ) ∈ R × (0, ∞), where ξ(x, θ) is defined by (2.6).

NOTE: With respect to the loss function (1.1), Theorem 2.1 establishes the admissibility of I^p among all confidence set estimators up to an equivalence class of procedures. Without the qualification of "up to an equivalence class," or some restriction on the class of confidence set procedures, no interval estimator can be admissible. This is due to the discontinuous nature of the loss function (1.1): any estimator I can be modified on a set of two-dimensional Lebesgue measure zero to improve its coverage probability on a set of θ values having measure zero. (Joshi [12] discusses this in some detail.)
Theorem 2.1 shows that if (2.28) holds, strict inequality can hold only on a set of θ values having Lebesgue measure zero, i.e., "almost" (or "weak") admissibility; but the conclusion of equivalence of the estimators is a stronger one than the conclusion only of "almost" admissibility. If the class of procedures is restricted to those which yield closed intervals for every observation, then Theorem 2.1 establishes admissibility without any qualifications regarding equivalence classes. Further, Theorem 2.1 establishes the "strong admissibility" of I^p according to the definition used by Joshi [12, Section 7].

PROOF OF THEOREM 2.1: We follow the line of Joshi's [12] argument, making appropriate modifications. Let

(2.29)  U^{(i)}(y) = 1 + ∫₀^∞ [c − g_∞(θ|y)] ξ^{(i)}(y, θ) dθ = E{L(θ, I^{(i)}) | Y = y}  (i = 1, p),

where I^{(1)} = I′ and

(2.30)  g_∞(θ|y) = {Φ(y/σ)}⁻¹ {2πσ²}^{−1/2} e^{−(θ−y)²/2σ²},  θ ≥ 0.

Clearly U^{(i)}(y) is minimized when

(2.31)  ξ^{(i)}(y, θ) = 1 for g_∞(θ|y) ≥ c;
        ξ^{(i)}(y, θ) = 0 for g_∞(θ|y) < c,

i.e., for I^{(p)}. Thus, either (I) U^{(1)}(y) = U^{(p)}(y) for almost all y in R, or (II) U^{(1)}(y) > U^{(p)}(y) for a subset S of R with positive measure.

If (I), then the desired equivalence can be established as on page 1065 of Joshi [12]. We show that (II) leads to a contradiction. Note that (II) implies that there exists a positive number k such that for some a

(2.32)  ∫_{−a}^{a} Φ(y/σ) [U^{(1)}(y) − U^{(p)}(y)] dy = k.

We now show, as in Joshi [12], that k = 0, achieving the desired contradiction. Using (2.11) we see that

(2.33)  E{L(θ, I^{(1)}) − L(θ, I^{(p)})} = (2/πγ²τ²)^{1/2} ∫_{−∞}^{∞} G_τ(y) dy,

where

(2.34)  G_τ(y) = e^{−y²/2γ²τ²} Φ(y/σγ) {[c(δ₂^{(1)}(y) − δ₁^{(1)}(y)) − ∫₀^∞ g_τ(θ|y) ξ^{(1)}(y, θ) dθ] − [c(δ₂^{(p)}(y) − δ₁^{(p)}(y)) − ∫₀^∞ g_τ(θ|y) ξ^{(p)}(y, θ) dθ]},

and g_τ(θ|y) is given by (2.4). The limit as τ → ∞ of g_τ(θ|y) is g_∞(θ|y) of (2.30), and the Helly-Bray lemma justifies taking the limit inside the integrals of (2.34), so that by (2.29) we find

(2.35)  lim_{τ→∞} G_τ(y) = Φ(y/σ) [U^{(1)}(y) − U^{(p)}(y)].
Next, we justify the interchange of limit and integration,

(2.36)  lim_{τ→∞} ∫_{−a}^{a} G_τ(y) dy = ∫_{−a}^{a} lim_{τ→∞} G_τ(y) dy = k,

by showing that |G_τ(y)| < G(y); in particular,

(2.37)  |G_τ(y)| < G(y) = c{(δ₂^{(1)}(y) − δ₁^{(1)}(y)) + (δ₂^{(p)}(y) − δ₁^{(p)}(y))} + 2.

By definition (2.17), [δ₂^{(p)}(y) − δ₁^{(p)}(y)] is bounded uniformly. Now consider

(2.38)  c ∫_{−a}^{a} [δ₂^{(1)}(y) − δ₁^{(1)}(y)] dy = c ∫_{−a}^{a} [δ₂^{(1)}(y) − δ₁^{(1)}(y)] (1/σ√(2π)) e^{−y²/2σ²} · σ√(2π) e^{y²/2σ²} dy
        ≤ σ√(2π) e^{a²/2σ²} E_{θ=0}{c(δ₂^{(1)}(Y) − δ₁^{(1)}(Y))}
        ≤ σ√(2π) e^{a²/2σ²} R(0, I^{(1)}) ≤ σ√(2π) e^{a²/2σ²} R(0, I^{(p)}),

the last inequality from (2.28). Thus G(y) is integrable over (−a, a), and from (2.36) we conclude that given ε > 0, there is a τ₀ such that for all τ > τ₀,

(2.39)  ∫_{−a}^{a} G_τ(y) dy > k − ε.

Now we observe that, from the construction of a Bayes procedure,

(2.40)  G_τ(y) ≥ e^{−y²/2γ²τ²} Φ(y/σγ) {[c(δ₂^τ(y) − δ₁^τ(y)) − ∫₀^∞ g_τ(θ|y) ξ^τ(y, θ) dθ] − [c(δ₂^{(p)}(y) − δ₁^{(p)}(y)) − ∫₀^∞ g_τ(θ|y) ξ^{(p)}(y, θ) dθ]} = D_τ(y) ≤ 0.

Thus,

(2.41)  (2/πγ²τ²)^{1/2} ∫_{−a}^{a} G_τ(y) dy ≥ (2/πγ²τ²)^{1/2} ∫_{−a}^{a} D_τ(y) dy ≥ (2/πγ²τ²)^{1/2} ∫_{−∞}^{∞} D_τ(y) dy = r(τ, I^τ) − r(τ, I^p).

By using (2.34), (2.39), and (2.41), and assuming τ₀ > 1,

(2.42)  r(τ, I^{(1)}) − r(τ, I^p) ≥ (2/πγ²τ²)^{1/2} (k − ε) + [r(τ, I^τ) − r(τ, I^p)].

By (2.28), r(τ, I^{(1)}) − r(τ, I^p) ≤ 0, so that

(2.43)  k ≤ (πγ²τ²/2)^{1/2} [r(τ, I^p) − r(τ, I^τ)] + ε.

From Lemma 2.4 and the arbitrary nature of ε, it follows that k = 0, and the proof of the Theorem is complete.

3. RISK OF I^p AND MINIMAX ESTIMATION

We shall now investigate the behavior of the uniform generalized Bayes interval as a function of the observed mean Y, and the risk function of I^p. We shall introduce a "truncated" estimate and show that it is minimax. It has not been possible to demonstrate the (conjectured) minimaxity of I^p.

LEMMA 3.1: The length of the interval I^p is a unimodal function of Y, with maximum of

(3.1)  2{−2σ² log [cσ√(2π) Φ(u(1))]}^{1/2}

at Y = σu(1), where u(1) (> 0) is given by (2.20) with γ = 1, θ = 0.
The length has a limit of

(3.2)  2{−2σ² log [cσ√(2π)]}^(1/2)

as Y increases, and behaves like

(3.3)  (σ²/|Y|) log |Y|

as Y decreases to −∞. Both endpoints, δ_1(Y) and δ_2(Y), are monotone increasing functions of Y, and δ_2(Y) > 0 for all Y.

PROOF: From the characterization (2.30) and (2.31), we see that the endpoints δ_i(y) (i = 1, 2) of I^p satisfy the equation

(3.4)  {2πσ²}^(−1/2) e^{−(δ−y)²/2σ²} = c Φ(y/σ),

so that (δ − y) decreases monotonely as y increases. The length is 2(δ − y) if y ≥ σu(1), where u(1) is defined by (2.20), and the length is just δ_2 (the larger root) if y < σu(1). The assertion that δ_2 is monotone increasing with y is equivalent to the assertion that the equation

(3.5)  Φ(x) − (cσ)φ(a − x) = 0,   a > 0,

has only one solution x with x < a. The latter statement follows by noting that the left side of (3.5) is positive and increasing for x < a − y_2 (say), is decreasing for a − y_2 < x < a − y_1 (say), then is increasing for a − y_1 < x, and is negative at x = a. The monotonicity of δ_1 is easily verified by implicit differentiation of (3.4). Putting Φ = 1 in (3.4) gives (3.2), and using the tail approximation for (1 − Φ(x)) when x is large (Feller [10, p. 166]) gives (3.3). This completes the proof.

Next we look at some aspects of the behavior of the risk function R(θ, I^p). Let I^0 denote the interval with endpoints Y ± {−2σ² log [cσ√(2π)]}^(1/2), which is the Pitman-type estimation interval for unrestricted θ and was shown by Joshi [12] to be admissible and minimax with constant risk

(3.6)  R̄ = R(θ, I^0) = 2[c{−2σ² log [cσ√(2π)]}^(1/2) + Φ(−{−2 log [cσ√(2π)]}^(1/2))].

Comparing the risks of the two estimators, we obtain:

LEMMA 3.2: The risk function R(θ, I^p) is discontinuous at θ = 0, with

(3.7)  R(0−, I^p) = 1 + c{2√2 σ ∫_{−∞}^{∞} [−log cσ√(2π)Φ(y)]^(1/2) φ(y) dy − √2 σ ∫_{−∞}^{u(1)} [−log cσ√(2π)Φ(y)]^(1/2) φ(y) dy − σφ(u(1))}
        = R̄ + Φ(u(1))[1 − (cσ)(cσ + u(1))] > 1 > R̄.
(3.8)  R(0, I^p) = R(0+, I^p) = R(0−, I^p) − Φ(u(1)) = R̄ − (cσ)Φ(u(1))[cσ + u(1)] < R̄,

and

(3.9)  lim_{θ→+∞} R(θ, I^p) = R̄.

PROOF: The expected length of I^p is a continuous function of θ, but the coverage probability drops to 0 for θ < 0. This accounts for the first part of (3.8) and for (3.7), which represents R(0−, I^p) as unity plus the expected length cost at zero. The other equality of (3.7) comes from using the change of variables v = {−2 log cσ√(2π)Φ(y)}^(1/2) to evaluate the integrals. The first inequality comes from the positive character of the expected length. The second comes from the fact that any point estimator has constant risk of unity, and I^0, being admissible, must have smaller risk. By taking limits inside integrals in (2.19) and using Lemma 2.3, (3.9) is verified, completing the proof.

NOTE: The argument used to establish the first inequality in (3.7) can be extended easily to show

(3.10)  R(θ, I^p) > 1 > R̄,   all θ < 0.

The preceding lemma suggests very strongly that the risk function of I^p is ≤ R̄ for all θ ≥ 0. However, we have been unsuccessful in attempting to prove this boundedness. The motivation for desiring it is the fact that any estimator whose risk is so bounded is minimax. This minimax property can be established by essentially the same proof as is used in Blumenthal and Cohen [7, Theorem 3.1].

One estimator which is minimax is the obvious "truncated" estimator I^T based on I^0, namely

(3.11)  δ_1^T(Y) = max [0, Y − D];   δ_2^T(Y) = max [0, Y + D],

where

(3.12)  D = {−2σ² log [cσ√(2π)]}^(1/2),

with the interval becoming a point for Y < −D. Clearly, for θ > 0, θ ∈ I^0 if and only if θ ∈ I^T, so that both intervals have the same coverage probability and I^T is never longer than I^0. Thus R(θ, I^T) ≤ R̄ for all θ ≥ 0, and I^T is minimax.
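The truncated estimator (3.11)–(3.12) is immediate to compute. The following is a minimal sketch (Python; the function names are ours, and it assumes cσ√(2π) < 1 so that the logarithm in (3.12) is negative and D is real):

```python
import math

def half_length(sigma, c):
    """Half-length D of (3.12): D = sqrt(-2 sigma^2 log(c sigma sqrt(2 pi))).

    Requires c * sigma * sqrt(2 pi) < 1, so the logarithm is negative.
    """
    return math.sqrt(-2.0 * sigma ** 2 * math.log(c * sigma * math.sqrt(2.0 * math.pi)))

def truncated_interval(y, sigma, c):
    """Truncated minimax interval I^T of (3.11): the endpoints Y -/+ D,
    each clipped at 0 to respect the restriction theta >= 0."""
    D = half_length(sigma, c)
    return max(0.0, y - D), max(0.0, y + D)
```

For Y < −D both endpoints clip to 0 and the interval degenerates to the single point {0}, exactly as noted above.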
The risk function of I^T is given (writing d = D/σ) by

(3.13)  R(θ, I^T) = cσ{d[1 − Φ(d − (θ/σ)) + Φ(d + (θ/σ))] + φ(d + (θ/σ)) − φ(d − (θ/σ)) + (θ/σ)[Φ(d + (θ/σ)) + Φ(d − (θ/σ)) − 1]} + 1 − [Φ(d) − (1 − h(θ))Φ(−d)]
        = R̄ − cσ{(d − (θ/σ))Φ(d − (θ/σ)) + (d + (θ/σ))Φ(−d − (θ/σ)) + φ(d − (θ/σ)) − φ(d + (θ/σ))} − h(θ)Φ(−d),

where h(θ) = 1 if θ = 0, and h(θ) = 0 otherwise. From (3.13), we see that lim_{θ→∞} R(θ, I^T) = R̄.

Although I^T is minimax, it is not likely that it is admissible, since its length is not a continuous function of Y. Clearly, generalized Bayes intervals will have continuous lengths, so that I^T would not be in the generalized Bayes class. If this class is complete, as is the case for point estimation (see Sacks [17]), then I^T could not be admissible.

We close this section by taking a brief look at the coverage probability of I^p as a function of θ.

LEMMA 3.3: The coverage probability of I^p, given by

(3.14)  P(θ) = Φ(u_2(1, θ) − (θ/σ)) − Φ(u_1(1, θ) − (θ/σ)),

is monotone decreasing in θ, with maximum (for θ ≥ 0) of Φ(u(1)) and minimum of (Φ(L) − Φ(−L)), where the u's are given by (2.20) and

(3.15)  L = {−2 log [cσ√(2π)]}^(1/2).

PROOF: The proof follows directly from using Lemma 2.3 in (3.14), and (3.14) comes directly from Lemma 2.2. To compare with the case of unrestricted θ, let the coverage probability in that case be (1 − α). Then L satisfies Φ(L) − Φ(−L) = (1 − α), and from (3.15), cσ = φ(L). Thus, from (2.20), u(1) satisfies

(3.16)  φ(u)/Φ(u) = φ(L),

and P(θ) decreases from Φ(u(1)) to (1 − α) as θ increases. From (3.15) we see that u(1) > L, and if α is small, so that L is large, Φ(u) ≈ 1 and u(1) ≈ L, giving max P(θ) = P(0) = Φ(L) = (1 − (α/2)). (In fact, P(0) > (1 − (α/2)).) Also, from above, we see that φ(u) = Φ(u)φ(L) > Φ(L)φ(L), so that u(1) < u*, where φ(u*) = Φ(L)φ(L) = (1 − (α/2))φ(L), and P(0) < Φ(u*). Numerically, we consider two examples, α = 0.05 and α = 0.10.
    α       L      1 − (α/2)    P(0) = Φ(u(1))    Φ(u*)
  0.05    1.96       0.975          0.9755         0.976
  0.10    1.64       0.95           0.952          0.953

It seems evident that, for the usual ranges of α values, (1 − (α/2)) is quite sufficient as an approximation to P(0). We might also note that the above method of bounding P(0) suggests an iterative procedure for solving (3.16), namely: define u_{i+1} by φ(u_{i+1}) = Φ(u_i)φ(L), i = 0, 1, . . . (u_0 = L). This gives a sequence whose even (odd) numbered terms are less (greater) than the desired solution, and whose even (odd) terms are monotone increasing (decreasing), so that the sequence converges to u(1).

4. THE POSTERIOR BAYES RISK AND SEQUENTIAL PROCEDURES

In this section, we shall consider the posterior expected loss, given the observations, for the Bayes procedures of Section 2. We shall study a sequential rule which is in effect a curtailment of the fixed sample rule and which guarantees a uniform bound on R(θ, I). Also, for linear sampling cost, a sequential Bayes rule which is Asymptotically Pointwise Optimal (A.P.O.) in the sense of Bickel and Yahav [4–6], and asymptotically optimal, will be given.

The posterior expected loss for the Bayes procedure is given by the minimum value of the inner integral of (2.11), namely

(4.1)  r(Y) = r⁺(Y) = 1 + 2cD(γ, Y) − [Φ(M(γ, Y)/ζ)]^(−1)[2Φ(D(γ, Y)/ζ) − 1]   if Y ≥ σγu(γ),
       r(Y) = r⁻(Y) = 1 + c[M(γ, Y) + D(γ, Y)] − [Φ(M(γ, Y)/ζ)]^(−1)[Φ(D(γ, Y)/ζ) − Φ(−M(γ, Y)/ζ)]   if Y < σγu(γ),

where

(4.2)  ζ² = σ²/γ²,

and u(γ) is defined by (2.20), γ by (2.5), M(γ, Y) by (2.8), and D(γ, Y) by (2.9). We shall write simply M and D where this causes no confusion. By straightforward differentiation, we find that r⁺(Y) is a monotone increasing function of Y with a maximum of

(4.3)  2[cζ{−2 log cζ√(2π)}^(1/2) + Φ(−{−2 log cζ√(2π)}^(1/2))].

The square-bracketed term in (4.3) goes monotonely to zero as ζ → 0 (i.e., as the number of observations, n, increases). The function r⁻(Y) is also monotone increasing in Y, with minimum of 0 at Y = −∞.
Thus the posterior risk r(Y) is monotone in Y with maximum (4.3).

If a Bayes confidence interval is sought in the case of unrestricted θ (−∞ < θ < ∞), relative to the prior density (2.3) also on the whole line, with loss (1.1), the resulting estimate M(γ, Y) ± D(γ, ∞) has constant posterior expected loss (4.3). Thus if a sample size n_0 is chosen so that this interval estimate will have Bayes risk r_0, the same risk can be achieved in the case of restricted θ by a sequential rule which stops as soon as r(Y) ≤ r_0, and the sequential rule will never take more than n_0 observations.

We shall now consider the stochastic limit, as n increases, of the random variable r(Y). From this, it will be possible to construct A.P.O. sequential stopping rules.

LEMMA 4.1: Let the random variable r(Y) be the posterior Bayes risk for interval I^τ. Then

(4.4a)  lim_{n→∞} (n/log n)^(1/2) r(Y) = 0    if θ < 0, w.p.1,
(4.4b)  lim_{n→∞} (n/log n)^(1/2) r(Y) = 2cν   if θ > 0, w.p.1,
(4.4c)  lim_{n→∞} (n/log n)^(1/2) r(Y) = cν    if θ = 0, w.p.1.

PROOF: From (4.1), (4.2), the definitions (2.1), (2.5), (2.8), (2.9), and the tail approximation to the normal, we find

(4.5a)  lim_{n→∞} (n/log n)^(1/2) r⁺(x) = 2cν,   x > ((log n)/n)^(1/2),
(4.5b)  lim_{n→∞} (n/log n)^(1/2) r⁻(x) = 0,    x < 0, x fixed,
(4.5c)  lim_{n→∞} (n/log n)^(1/2) r⁻(x) = cν[(a² + 1)^(1/2) − a],   x = −aν((log n)/n)^(1/2), a > 0,
(4.5d)  lim_{n→∞} (n/log n)^(1/2) r⁻(x) = cν(a + 1),   x = aν((log n)/n)^(1/2), a > 0.

From the definition (2.20), we find

(4.6)  σγu(γ) = ν√((log n)/n) + o(√((log n)/n))

as n increases. Using the fact that Y → θ w.p.1 as n increases, and the fact that r(Y) = r⁺(Y) if Y > σγu(γ) (= r⁻(Y) otherwise), along with (4.5a), (4.5b), and (4.6), we see that w.p.1 r(Y) = r⁺(Y) if θ > 0 (= r⁻(Y) if θ < 0), and that (4.4a) and (4.4b) hold. For θ = 0, we use the law of the iterated logarithm to conclude that w.p.1, |Y| < 2√((2 log log n)/n), so that w.p.1, (n/log n)^(1/2) r(Y) is given by either (4.5c) if Y < 0 or by (4.5d) if Y > 0, with a = 0.
Both expressions agree in that case, giving (4.4c) and completing the proof.

It is immediately evident from the proof above that

(4.7)  sup_n (n/log n)^(1/2) E(r(Y)) < ∞,

where the expectation is with respect to the marginal density f_γ(y).

We now consider the construction of sequential Bayes stopping rules where the cost per observation is a constant, c_0, with stopping cost at stage n defined as

(4.8)  c_0 n + L(θ, I),

with L(θ, I) given by (1.1). Proceeding as in Bickel and Yahav (1968), we define the stopping rule: stop for the first n such that

(4.9)  r(Y)[1 − ((n log (n + 1))/((n + 1) log n))^(1/2)] ≤ c_0,

which is obtained by writing r(Y) as ((log n)/n)^(1/2) V, and finding the n which minimizes (n c_0 + r(Y)). From Lemma 4.1 and (4.7), it is clear that the conditions of Theorems 2.1 and 3.1 of Bickel and Yahav [6] hold if (n/log n)^(1/2) is read in place of n^β. This substitution causes no difficulties in the proofs of these theorems. (The requirement that V > 0 is met, since none of our distributions put any mass on θ < 0.) Expanding the expression in (4.9), we obtain the approximate rule:

(4.10)  stop as soon as (r(Y)/2n) ≤ c_0.

Summarizing the conclusions indicated above, we have

THEOREM 4.1: For the loss (4.8), the stopping rules (4.9) and (4.10) are A.P.O. and asymptotically optimal in the sense of Bickel and Yahav [6].

5. COST DUE TO LENGTH WEIGHTED BY A FUNCTION OF THE MEAN

In this section, we take a brief look at the intervals which result from using a loss function of the form

(5.1)  L(θ, I) = h(θ)(δ_2(Y) − δ_1(Y)) + {0 if δ_1(Y) ≤ θ ≤ δ_2(Y); 1 otherwise},

where the relative importance of the length of the interval depends on the size of the unknown mean being estimated, and the constant c of (1.1) is absorbed in h(θ). Proceeding as in Lemma 2.1, we find the Bayes solutions to have the form

(5.2)  ξ(θ, y) = 1 if g_τ(θ|y) ≥ E[h(θ)|y];  = 0 otherwise,

where

(5.3)  E[h(θ)|y] = ∫_0^∞ h(θ) g_τ(θ|y) dθ,

and g_τ(θ|y) is given by (2.4).
This leads to intervals of the form

(5.4)  δ_1(Y) = max [0, M(γ, Y) − G(γ, Y)],
       δ_2(Y) = max [0, M(γ, Y) + G(γ, Y)],

where M(γ, Y) is given by (2.8), and

(5.5)  G(γ, Y) = {(2σ²/γ²) log [(γ²/2πσ²)^(1/2) / (E[h(θ)|Y] Φ(Y/σγ))]}^(1/2).

For a solution of (5.2) to exist, some restriction on h(θ) is needed, such as

       sup_{0≤θ<∞} h(θ) ≤ (2πσ²)^(−1/2),

which is quite severe, but which also assures that δ_2(Y) > 0 for all Y. As we vary h(θ), we can obtain quite different properties for the resulting interval, the effect of h(θ) entering through the half-length G(γ, Y).

The requirement that the δ_i be monotone can be written as

(5.6)  G(γ, Y) − M(γ, Y) + [B(Y, θh(θ))/B(Y, h(θ))] ≥ 0   for δ_1,
       G(γ, Y) + M(γ, Y) − [B(Y, θh(θ))/B(Y, h(θ))] ≥ 0   for δ_2,

where, for any function r(θ),

(5.7)  B(Y, r(θ)) = ∫_0^∞ r(θ) e^{−(γ²/2σ²)(θ − (Y/γ²))²} dθ.

If it is desired to choose h(θ) so that the length of the interval behaves for large M approximately like a given function f(M), h(θ) must at least approximately satisfy

(5.8)  B(Y, h(θ)) ≐ e^{−(γ²/8σ²) f²(M)}.

Note that the condition δ_2(Y) > 0 can be written

(5.9)  B(Y, h(θ)) < e^{−γ²M²/2σ²}   for Y < 0.

For f(M) = kM (k any constant), (5.8) and (5.9) may conflict. From (5.7) we expect, as a first approximation, that for large M, B(Y, h(θ)) ≈ h(M), so that, in view of (5.8), we would try

(5.10)  h(θ) ∝ e^{−Aθ²}

for some A > 0, in order to have length approximately f(M) for large M. When M is negative, the length is not 2G(γ, Y) but M + G, and will not necessarily be of the form f(M). In fact, as M → −∞ we generally want (M + G) → 0, regardless of the form of the length for larger M.

Paulson [16] suggests a rationale for desiring a half-length of order kM (k < 1) for large M and approaching 0 for small M, in a problem where θ represents the mean of a difference between two normal variables. We shall construct here an interval having this property by choosing h(θ) = ce^{−Aθ²}.
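With the choice h(θ) = ce^{−Aθ²} just described, B(Y, h(θ)) of (5.7) is a one-sided Gaussian integral, and completing the square gives the closed form quoted in (5.11) below. The sketch that follows is our own numerical sanity check of that identity (Python; function names are ours, and it assumes the reading ζ² = σ²/γ², so that the kernel of (5.7) is e^{−(θ−M)²/2ζ²}):

```python
import math

def Phi(x):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def B_quadrature(M, zeta, c, A, upper=40.0, n=200000):
    """Trapezoidal approximation of B = int_0^inf c e^{-A t^2} e^{-(t-M)^2/(2 zeta^2)} dt."""
    h = upper / n
    total = 0.0
    for k in range(n + 1):
        t = k * h
        w = 0.5 if k == 0 or k == n else 1.0
        total += w * c * math.exp(-A * t * t - (t - M) ** 2 / (2.0 * zeta ** 2))
    return total * h

def B_closed_form(M, zeta, c, A):
    """Completing the square with q = 1 + 2 A zeta^2 gives
    B = c sqrt(2 pi zeta^2 / q) e^{-A M^2 / q} Phi(M / (zeta sqrt(q)))."""
    q = 1.0 + 2.0 * A * zeta ** 2
    return (c * math.sqrt(2.0 * math.pi * zeta ** 2 / q)
            * math.exp(-A * M * M / q)
            * Phi(M / (zeta * math.sqrt(q))))
```

The two agree to quadrature accuracy, which supports reading (5.11) as this closed form.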
The resulting B(Y, h(θ)) is

(5.11)  c(2πζ²/(1 + 2Aζ²))^(1/2) e^{−AM²/(1+2Aζ²)} Φ(M/(ζ√(1 + 2Aζ²))),

where

(5.12)  ζ² = σ²/γ².

Since

(5.13)  G(γ, Y) = {−2ζ² log B(Y, h(θ))}^(1/2),

we see that for large M the interval length is, to a first-order approximation,

(5.14)  2(2Aζ²/(1 + 2Aζ²))^(1/2) M.

Using the tail approximation for Φ(−x) as x increases, we find that for Y → −∞,

(5.15)  δ_2(Y) = ζ²(log |M(γ, Y)|)/|M(γ, Y)| + O(1/|Y|),

which is the same order as for h(θ) = c in (3.3). From (5.14), we see that A can be chosen as k²/(2ζ²(1 − k²)). Rewriting B(Y, h(θ)) as

(5.16)  e^{−γ²M²/2σ²} ∫_0^∞ h(θ) e^{−(γ²/2σ²)(θ² − 2θM)} dθ   for M < 0,

and noting that the integral in (5.16) is (ζ²h(0)/|M|) plus terms of smaller order as M → −∞, we see from (5.13) that δ_2(Y) will always have the order of magnitude indicated in (5.15) as Y → −∞.

Sequential procedures can be developed from the posterior expected loss function for these loss functions, just as we did in Section 4 for the loss function (1.1).

APPENDIX

Proofs for Section 2

In this Appendix, we supply the proofs of Lemmas 2.3 and 2.4 and Corollary 2.3. We start with

PROOF OF LEMMA 2.3: Write v = u − (γθ/σ) in (2.20), so that we are looking for the two solutions (one positive, one negative) of

(A.1)  Φ[v + (γθ/σ)] = (γ/cσ)φ(v).

The right side of (A.1) exceeds the left at v = 0. At v = ±[2 log (γ/cσ√(2π))]^(1/2), we have unity for the right side, which exceeds the left side. Since Φ(x) is monotone increasing, and φ(x) increases for x < 0 and decreases for x > 0, (2.23) is proved. Noting that, as θ increases, Φ[v + (γθ/σ)] increases monotonely to unity for each v gives the monotonicity and limit results for v_i(γ, θ). The monotonicity of u_i(γ, θ) follows from the defining relation (2.20).
If u_i(γ, θ) > 0, then Φ(u_i) > 1 − (φ(u_i)/u_i), so that for v satisfying (A.1)

(A.2)  (γ/cσ)φ(v) > 1 − φ[v + (γθ/σ)]/[v + (γθ/σ)],

or, rearranging, taking logarithms, and using √(A + B) ≤ √A + √B,

(A.3)  [2 log (γ/cσ√(2π))]^(1/2) + (2cσ/γ)^(1/2) [v + (γθ/σ)]^(−1/2) e^{−(1/4)[(γθ/σ)² + 2v(γθ/σ)]} > |v|.

Because v_i(γ, θ) has a limit, for θ > θ* (say), v_i(γ, θ) > −k (k specified) (and [(γθ/σ) − 2k] > 1), so that

       [2 log (γ/cσ√(2π))]^(1/2) + (2cσ/γ)^(1/2) e^{−γθ/4σ} > |v|,

which proves (2.24).

From (2.20), we have Φ(u(γ)) − Φ(u(1)) = (1/cσ)(γφ(u(γ)) − φ(u(1))) > 0. By Taylor's expansion,

       (u(γ) − u(1)) = (1/cσ)(φ(u(1))/φ(ū))[γ(φ(u(γ))/φ(u(1))) − 1],

where u(1) ≤ ū ≤ u(γ), so that, since (φ(u(1))/φ(ū)) ≤ (φ(u(1))/φ(u(γ))) and

(A.4)  φ(u(γ))/φ(u(1)) ≤ 1,

we obtain

       (u(γ) − u(1)) ≤ (1/cσ)γ[γ − 1] ≤ (1/cσ)(1 + (σ²/2τ²))(σ²/2τ²),

using √(1 + x) ≤ 1 + (x/2). When τ ≥ 1, (2.25) follows from (A.4). This completes the proof of the lemma.

Next, we give

PROOF OF LEMMA 2.4: We break the difference

(A.5)  τ² ∫_0^∞ [R(θ, I^τ) − R(θ, I^p)] g_τ(θ) dθ

into four pieces, which will be bounded separately. First, consider the contribution due to noncoverage probabilities, namely

(A.6)  τ² ∫_0^∞ {[Φ(γu_2(γ, θ) − (θ/σ)) − Φ(γu_1(γ, θ) − (θ/σ))] − [Φ(u_2(1, θ) − (θ/σ)) − Φ(u_1(1, θ) − (θ/σ))]} g_τ(θ) dθ.

Using the mean value theorem, we write the braced term in (A.6) as

(A.7)  (γ − 1) G(γ*, θ),

where

(A.8)  G(γ*, θ) = Σ_{i=1}^{2} (−1)^i φ(γ* u_i(γ*, θ) − (θ/σ)) [γ* (∂u_i(γ*, θ)/∂γ) + u_i(γ*, θ)]

and 1 ≤ γ* ≤ γ. Differentiating (2.20) with respect to γ, writing u for u(γ, θ), and simplifying, we obtain

       φ(u)(∂u/∂γ) = (1/cσ){φ(u − (γθ/σ)) − γ[(∂u/∂γ) − (θ/σ)][u − (θ/σ)]φ(u − (θ/σ))},

and, using (2.20) again,

       φ(u)(∂u/∂γ) = Φ(u){(1/γ) − [(∂u/∂γ) − (θ/σ)][u − (θ/σ)]},

so that

(A.9)  (∂u/∂γ) = {(1/γ) + (θ/σ)(u − (θ/σ))} / {(φ(u)/Φ(u)) + (u − (θ/σ))}.
Note that τ²(γ − 1) ≤ (σ²/2), so that we want to show the boundedness of

(A.10)  ∫_0^∞ g_τ(θ) G(γ*, θ) dθ,

and, combining (A.8) and (A.9), we have

(A.11)  G(γ*, θ) = Σ_{i=1}^{2} (−1)^i φ(γ* u_i(γ*, θ) − (θ/σ)) {1 + u_i(φ(u_i)/Φ(u_i)) + [u_i + γ*(θ/σ)][u_i − (θ/σ)]} / {(φ(u_i)/Φ(u_i)) + (u_i − (θ/σ))},

writing u_i for u_i(γ*, θ) inside the ratios. The denominator in (A.11) is of order (1/|u|³) as u → −∞, so that, when the φ(γ*u_i − (θ/σ)) term is taken into account, we see that G(γ*, θ) is bounded as long as each u_i(γ*, θ) is bounded above. This is the case if θ is bounded above, say by A. We next note from Lemma 2.3 that the denominator in (A.11) has a limit as θ → ∞ and is bounded away from zero for θ > A. Since (uφ(u)/Φ(u)) → 0 as u → ∞, and we can write [u + (γ*θ/σ)] = [(u − (γ*θ/σ)) + 2γ*θ/σ], where (u − (γ*θ/σ)) is bounded in the region of interest, we see that for θ > A, G(γ*, θ) is composed of some terms which are uniformly bounded, plus a term of the form

(A.12)  (2γ*θ/σ) {φ(γ*u_2 − (θ/σ)) [(φ(u_2)/Φ(u_2)) + (u_2 − γ*(θ/σ))]^(−1) − φ(γ*u_1 − (θ/σ)) [(φ(u_1)/Φ(u_1)) + (u_1 − γ*(θ/σ))]^(−1)},

where the arguments of u_1 and u_2 are suppressed to save space. By Lemma 2.3, for θ large enough, u_1 > (γ*θ/σ) − k, so that φ(u_1) → 0 as θ increases and the first terms in (A.12) are uniformly bounded. Since (u_1 − γ*θ/σ) has a limit by Lemma 2.3 as θ increases, our attention can be limited to

(A.13)  ∫_A^∞ (θ/σ) g_τ(θ) [φ(γ*u_2 − (θ/σ)) − φ(γ*u_1 − (θ/σ))] dθ,

which we now examine. Write the bracketed difference as

(A.14)  φ(γ*u_1 − (θ/σ)) {e^{−(1/2)γ*(u_2 − u_1)[γ*(u_1 + u_2) − 2(θ/σ)]} − 1}.

Using the v_i defined by (A.1) and the definition (2.5), the exponent in (A.14) becomes

(A.15)  −(1/2)(u_2 − u_1)γ*[(2θσ/τ*²) + γ*(v_1 + v_2)],

where τ < τ* < ∞. By (2.24), we may take A sufficiently large so that

(A.16)  |v_1 + v_2| < 2√(2cσ/γ*) e^{−γ*θ/4σ} < 2√(2cσ) e^{−θ/4σ}

and

(A.17)  |u_2 − u_1| < 4[2 log (γ*/cσ√(2π))]^(1/2).

If (A.15) is negative, then (A.14) is negative, and (A.13) is bounded above by 0.
Consider the set of θ for which (A.15) is positive, i.e., (v_1 + v_2) < 0 and 0 < (2θσ/τ*²) < −γ*(v_1 + v_2), so that, by (A.16) and (A.17), (A.15) is less than

(A.18)  4[4cσ log (γ*/cσ√(2π))]^(1/2) γ*² e^{−γ*θ/4σ}.

Assume τ large enough so that γ* ≤ γ < 2; then it is clear that, for A large enough, there is a constant D̄, say, so that (A.14) is less than

(A.19)  D̄ τ e^{−θ/4σ} < D̄(τ/θ),

so that (A.13) is bounded by

(A.20)  ∫_A^∞ D̄ g_τ(θ) dθ < D̄.

Putting together the above pieces, we obtain the boundedness of (A.6).

Next, we must examine the contributions to (A.5) due to the interval lengths. The first two terms of (2.19) contribute to r(τ, I^τ) a term, (A.21), which integrates the normal density times its argument over the range (−∞, γσu(γ)) and can be evaluated explicitly in terms of Φ(γu(γ)), (1 + (σ²/τ²))^(−1/2), and e^{−γ²u²(γ)/2(1+(τ²/σ²))}. The corresponding term for r(τ, I^p) is given by replacing γ by 1 in (A.21). When the two terms are differenced, and (2.25) is used, the contribution to (2.26) will be found by straightforward computations to be O(1/τ).

Next, we examine the term

       (2√2 σ/γ) ∫ [−log c√(2π)(σ/γ) Φ(γ^(−1)(y + (θ/σ)))]^(1/2) φ(y) dy,

which, when averaged with respect to g_τ(θ), becomes, by changing the order of integration,

(A.22)  (4σ/√(π(1 + (τ²/σ²)))) ∫_{−∞}^{∞} [−γ^(−2) log c√(2π)(σ/γ) Φ(x/γ)]^(1/2) φ(x/γ) e^{−x²/2(σ²+τ²)} dx.

A similar expression for the term in r(τ, I^p) replaces γ by 1 in the square-root term, but is otherwise the same.
The contribution to (A.5) is then, omitting constants, writing (1 + (τ²/σ²)) as γ²/(γ² − 1), and writing D̃ for (c√(2π)σ),

(A.23)  (τ²√(γ² − 1)/γ) ∫_{−∞}^{∞} Φ(x/γ) {[−log D̃Φ(x)]^(1/2) − (1/γ)[−log (D̃/γ)Φ(x/γ)]^(1/2)} e^{−x²(γ²−1)/2γ²σ²} dx.

Rewrite the Φ(x/γ){·} term as

       [−log D̃Φ(x)]^(1/2)[Φ(x/γ) − Φ(x)] + {Φ(x)[−log D̃Φ(x)]^(1/2) − (1/γ)Φ(x/γ)[−log (D̃/γ)Φ(x/γ)]^(1/2)},

and use the mean value theorem to write it as (γ − 1)H(x, γ*), where 1 ≤ γ* ≤ γ and

       H(x, γ*) = (1/γ*²){−xφ(x/γ*)[−log D̃Φ(x)]^(1/2) + [−log (D̃/γ*)Φ(x/γ*)]^(1/2) {Φ(x/γ*) + (x/γ*)φ(x/γ*) + (1/2)[−log (D̃/γ*)Φ(x/γ*)]^(−1) [(x/γ*)φ(x/γ*) − Φ(x/γ*)]}}.

It is easily seen, using the tail approximations for the normal distribution, that if γ ≤ 2 (say), H(x, γ*) is bounded uniformly in x and 1 ≤ γ* ≤ 2. Since τ²(γ − 1) is bounded also, we have, for some constant J, that (A.23) is less than

       J(√(γ² − 1)/γ) ∫_{−∞}^{∞} e^{−x²(γ²−1)/2γ²σ²} dx = √(2π) σ J.

The remaining term in (2.19) contributes to r(τ, I^τ) a constant times the integrand of (A.22) integrated over the range (−∞, γu(γ)). Break the range into (−∞, u(1)) and (u(1), γu(γ)). Over the range (−∞, u(1)), look at the difference written as in (A.23), and it is seen that the same bound applies. What remains is the integral over (u(1), γu(γ)). In this range, the integrand of (A.22) is bounded uniformly and, by (2.25), the length of the range is O(τ^(−2)). This completes the proof of Lemma 2.4.

Next we prove Corollary 2.3.

PROOF OF COROLLARY 2.3: To evaluate the limit of (A.5) as τ² → ∞, we use the first terms of Taylor expansions in place of mean-value-theorem approximations, i.e., use G(1, θ) (the γ* = 1 form of (A.11)) and

       H(x, 1) = [−log D̃Φ(x)]^(1/2){Φ(x) + (1/2)[−log D̃Φ(x)]^(−1)(xφ(x) − Φ(x))}

in place of G(γ*, θ) and H(x, γ*), respectively. In (A.10), substitute θ = τη and consider the limit,

       lim_{τ→∞} G(1, τη) = 2cσ{[−2 log cσ√(2π)]^(1/2) + [−2 log cσ√(2π)]^(−1/2)}.

Noting that τ²(γ − 1) → (σ²/2), we have for the limit of (A.
6),

(A.24)  cσ³{[−2 log cσ√(2π)]^(1/2) + [−2 log cσ√(2π)]^(−1/2)}.

In (A.23), use z = x(√(γ² − 1)/γ) and

(A.25)  lim H(zγ/√(γ² − 1), 1) = [−log cσ√(2π)]^(1/2) − (1/2)[−log cσ√(2π)]^(−1/2)  if z > 0;  = 0  if z < 0,

to obtain a limit for (A.23) of

(A.26)  √2 σ³{[−log cσ√(2π)]^(1/2) − (1/2)[−log cσ√(2π)]^(−1/2)}.

For the difference over the range (−∞, u(1)), note that, after transformation, the range is (−∞, u(1)(√(γ² − 1))/γ), which approaches (−∞, 0); noting (A.25), this contributes nothing. The integral over (u(1), γu(γ)) is easily seen to be O(τ^(−3)) and contributes negligibly. The net result is that (after multiplying (A.26) by c) we have (2.27), and the proof of Corollary 2.3 is complete.

REFERENCES

[1] Aitchison, J. and I. R. Dunsmore, "Linear-Loss Interval Estimation of Location and Scale Parameters," Biometrika, 55, 141-148 (1968).
[2] Bartholomew, D. J., "Contribution to the Discussion of the Paper by C. M. Stein, 'Confidence Sets for the Mean of a Multivariate Normal Distribution,'" J. Roy. Statist. Soc. B, 24, 291-292 (1962).
[3] Bartholomew, D. J., "A Comparison of Some Bayesian and Frequentist Inferences," Biometrika, 52, 19-35 (1965).
[4] Bickel, P. J. and J. A. Yahav, "Asymptotically Pointwise Optimal Procedures in Sequential Analysis," Proc. Fifth Berk. Symp. Math. Statist. Prob. (Univ. of California Press, 1965).
[5] Bickel, P. J. and J. A. Yahav, "Some Contributions to the Asymptotic Theory of Bayes Solutions," Stanford University Dept. of Statistics, Technical Rept. No. 24 (1967).
[6] Bickel, P. J. and J. A. Yahav, "Asymptotically Optimal Bayes and Minimax Procedures in Sequential Estimation," Ann. Math. Statist. 39, 442-456 (1968).
[7] Blumenthal, S. and A. Cohen, "Estimation of the Larger Translation Parameter," Ann. Math. Statist. 39, 502-516 (1968).
[8] Dudewicz, E. J., "Multiple-Decision (Ranking and Selection) Procedures: Estimation," Unpublished Thesis, Cornell University (1969).
[9] Farrell, R., "Estimators of a Location Parameter in the Absolutely Continuous Case," Ann. Math. Statist. 35, 949-998 (1964).
[10] Feller, W., An Introduction to Probability Theory and Its Applications (J. Wiley, New York, 1957), Vol. 1, 2nd ed.
[11] Ferguson, T. S., Mathematical Statistics, A Decision Theoretic Approach (Academic Press, New York, 1967).
[12] Joshi, V. M., "Admissibility of the Usual Confidence Sets for the Mean of a Univariate or Bivariate Normal Population," Ann. Math. Statist. 40, 1042-1067 (1969).
[13] Katz, M. W., "Admissible and Minimax Estimates of Parameters in Truncated Spaces," Ann. Math. Statist. 32, 136-142 (1961).
[14] Konijn, H., "Some Estimates Which Minimize the Least Upper Bound of a Probability Together with the Cost of Observation," Ann. Inst. Statist. Math. 7, 143-158 (1956).
[15] Konijn, H., "Minimax Interval Estimates with a Shortness Criterion: A New Formulation," Ann. Inst. Statist. Math. 15, 79-81 (1964).
[16] Paulson, E., "Sequential Interval Estimation for the Means of Normal Populations," Ann. Math. Statist. 40, 509-516 (1969).
[17] Sacks, J., "Generalized Bayes Solutions in Estimation Problems," Ann. Math. Statist. 34, 751-768 (1963).
[18] Stein, C. M., "Confidence Sets for the Mean of a Multivariate Normal Distribution," J. Roy. Statist. Soc. B, 24, 265-296 (1962).

THE VALUE STATEMENT

M. E. Nightengale
U.S. Air Force Academy

ABSTRACT

Frequently there exists an insufficient amount of actual data upon which to base a decision. In this paper a method is presented whereby the subjective opinions of a group of qualified persons are utilized to quantify the relative importance of a finite number of parameters or objectives. A means of testing the consistency among the judges is given which allows the decision maker to determine the validity of the opinions gathered. The application presented here is in the area of Multiple Incentive Contracting.
Namely, a method is proposed to facilitate the answering of the question, "What is the value to the purchaser of an incremental change in the performance of a system?" Such a vehicle is not essential to the methodology proposed. INTRODUCTION A current trend in governmental procurement is the use of a form of systems analysis in structuring Multiple Incentive Contracts. The incentives are structured in such a manner that the government will obtain a product that results in the minimum total cost for development, procurement, and operation of the product in the fulfillment of a defined mission. The scale of payments from the buyer to the seller depends upon an explicit statement of future contingencies; that is, the contractor receives different rewards for different levels of performance. Where does this explicit statement originate? The search for an answer to that question is likely to unearth another question; namely, how was the mission defined? What thoughts, judgments, and opinions went into the definition of the mission? Notice that the question itself is phrased in subjective terminology. Projections into the future must necessarily be based on subjective evaluations rather than on quantitatively derived forecasts. Even if there were some mathematical model available, the input assumptions and the interpretation of the output are functions of the expertise of the individuals involved. Since there is no proper theoretical foundation, and since one must rely to some extent on sub- jective evaluations, there are but two options available to the buyer: (1) he can wait until there exist sufficiently adequate quantitative methods, or (2) he can make the best of a poor situation and try to use the opinions of a group of experts in some reliable and beneficial manner. Thus, one must almost inevitably arrive at the decision to use the best data available — subjective evaluations. 
This article will not delve into, nor include a discussion of, incentive contracting and the philosophy underlying it. The interested reader is invited to read the Air Force Academy Manual, "The Evaluation and Structuring Techniques of Multiple Incentive Contracts."

Thus far one has seen that the questions of how the rewards are determined and, in fact, how the mission itself has been defined have led to the search for a method or methods of handling subjective evaluations. Methods for quantifying such evaluations exist in the literature today. One of the major problems in structuring incentive contracts has been that the proposed "solution" commences with the assumption that the fundamental problem has been solved! That is to say, a curve has been drawn, an equation postulated, or some such scheme, and what remains to be accomplished is to manipulate the tangents or derivatives and arrive at the proper conclusion. The basic problem, however, remains unsolved.

The specific statement often referred to as "the value statement" is merely a statement of the value to the government of an incremental increase, or decrease, in some parameter. In multiple incentive contracting, this "value" is measured in dollars. The problem posed is then one of somehow assigning these dollar values. Since this is such a specific problem, it will perhaps be of more interest and value to present here a method for handling the general case; that is, assigning numerical values to qualitative judgments in discrete cases. The application of the general method to the specific problem will be readily apparent.

First, consider a discrete problem, which may be exemplified by having a group of judiciously selected experts choose the most desirable parameters to incentivize from among all of the parameters. The basis for this method of solution is the Law of Comparative Judgment, described by Thurstone [6] in 1927.
A detailed explanation and illustrative examples of this method may be found in Nightengale [4, 5]. An explanation of the computational procedure will be presented here. COMPUTATIONAL PROCEDURE The computational procedure associated with the law of comparative judgment applied to ranking the parameters involves the tabulation of the number of times parameter i is judged more significant than parameter j. This tabulation is shown for n parameters in Table I. Table I. Matrix A: Number of Times Parameter i Judged More Significant Than Parameter y Parameter i Parameter./ 1 2 3 . . . j . . . n 1 a r > Ol3 a,j a, n 2 a-n a 23 a-,j ai„ 3 an a 32 a 3 j a 3 „ i a n a,i a« a,j a,„ n a„\ a, ,1 <?,!.! a„j a,j= the number of times parameter i is judged to be more significant than parameter j, i +j. It should be noted that ties or indifference judgments are not allowed. This is consistent with the assumptions involved in Thurstone's Case Five [6, p. 45], upon which this procedure is based. Although the computations introduced by allowing ties could be easily handled, the law of comparative judgment provides for only two choices. One should also be aware that if the number of parameters to be judged is too great, fatigue and indifference will become important factors. The main diagonal of Matrix A THE VALUE STATEMENT 509 will be left blank because a parameter is not judged against itself. The total number of judgments required to obtain the matrix is n(n— l)/2. Matrix P is constructed from Matrix A and is used to display the percentile of times of the total judgments that parameter i was judged more significant than parameter j (see Table II). Once again the diagonal will contain zeros. The elements of the P Matrix have the property that p,j + Pji= 1. TABLE II. Matrix P: Percentile of Times Parameter i Judged More Significant Than Parameter) Parameter i Parameter j Row totals 1 2 3 . . . 
The entry p_ij is the proportion of times parameter i is judged more significant than parameter j, i ≠ j; p_ij = a_ij/c, where c is the number of judges.

Matrix Z is used to convert Matrix P into standard measurements of separation in terms of the equal standard deviations of the discriminal-dispersion scale (see Table III).

Table III. Matrix Z: Basic Transformation Matrix

The entry z_ij is the separation between parameter i and parameter j in terms of standard deviations, i ≠ j. Each row is summed to give Z_i = Σ_j z_ij, and the row mean is Z̄_i = Z_i/n.

The element z_ij is the standardized normal deviate corresponding to the element p_ij. The element z_ij will be positive for all values of p_ij greater than 0.50 and negative for all values of p_ij less than 0.50. Zeros are entered in the diagonal cells. The matrix is skew-symmetric; that is, z_12 = -z_21. That this is necessarily so may be shown as follows:

p_ij = G(z_ij) = ∫_{-∞}^{z_ij} (2π)^{-1/2} exp(-z²/2) dz,

where z_ij is the unknown standardized normal variate. Since p_ij = 1 - p_ji,

∫_{-∞}^{z_ij} (2π)^{-1/2} exp(-z²/2) dz = 1 - ∫_{-∞}^{z_ji} (2π)^{-1/2} exp(-z²/2) dz
= ∫_{z_ji}^{∞} (2π)^{-1/2} exp(-z²/2) dz = ∫_{-∞}^{-z_ji} (2π)^{-1/2} exp(-z²/2) dz.

Hence z_ij = -z_ji.

Matrix Z contains the sample estimates z_ij of the theoretical values found in the equation of the law of comparative judgment. The element z_ij is then an estimate of the difference (S_i - S_j) between the scale values of the two stimuli, measured in units of the standard deviation of the distribution of discriminal differences. Each independent element of Matrix Z is an estimate of a value for one equation of the law of comparative judgment.
Z̄_i is the average difference between the scale value of the ith stimulus and the scale values of the stimuli with which it was compared. Because of the manner in which the scale was originally constructed, the Z̄_i values may then be transformed by use of normal tables, G(Z̄_i), into a percentage of the area under the normal curve. The G(Z̄_i) values may then be normalized and used as the decision maker deems appropriate.

TESTING THE SCALE VALUES

The reliability of these subjective estimates of the significance of the parameters depends upon the internal consistency of the data. A test exists for checking this internal consistency; it consists of starting with the final scale values and working backward to the theoretical percentages demanded by them. The discrepancies, designated by Δ_ij, are compared with the mean of the absolute values of all of the discrepancies. This mean is taken as the deviation about the expected p_ij. The average deviation is computed by

Average Deviation = Σ|Δ_ij| / N,

where N = number of discrepancy calculations. The average deviation has several advantages: it is easy to define, easy to compute, includes all the variates, and gives appropriate weight to extreme cases. Its disadvantage is its unsuitability for algebraic manipulation, since signs must be adjusted in its definition. Also, if the means and mean deviations of two sets of variates are known, there is no ready method for calculating the mean deviation of the combined set without going back to the original data.

At this point in the analysis, a test for the internal consistency of the judges' estimates has been stated. The evaluation of any inconsistency calls for a judgment on the part of the analyst as to its severity. The possible consequences in terms of human lives, dollars, or whatever units of value are involved play a most important role in this judgment.
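The computational procedure just described (rank orderings, Matrix A, Matrix P, Matrix Z, normalized relative significance) can be sketched compactly. This is an illustrative implementation, not the author's; the function name is mine, and Python's statistics.NormalDist stands in for the printed normal tables.

```python
from statistics import NormalDist

nd = NormalDist()

def thurstone_case_v(ranks):
    """Case V scale values from rank orderings (rank 1 = most significant).

    ranks[k][i] is judge k's rank for parameter i.  Returns the mean scale
    values Z-bar and the normalized relative significances based on G(Z-bar).
    """
    c, n = len(ranks), len(ranks[0])
    # Matrix A: a[i][j] = number of judges placing i ahead of j (no ties)
    a = [[sum(1 for rk in ranks if rk[i] < rk[j]) if i != j else 0
          for j in range(n)] for i in range(n)]
    # Matrix P: observed proportions, with p_ij + p_ji = 1 off the diagonal
    p = [[a[i][j] / c for j in range(n)] for i in range(n)]
    # Matrix Z: standardized normal deviates; the blank diagonal counts as 0.
    # Note inv_cdf requires 0 < p < 1, so a unanimous judgment breaks Case V.
    z_bar = [sum(nd.inv_cdf(p[i][j]) for j in range(n) if j != i) / n
             for i in range(n)]
    g = [nd.cdf(z) for z in z_bar]      # area under the normal curve
    return z_bar, [x / sum(g) for x in g]
```

Applied to the ten rank orderings of Table IV below, this reproduces the mean column of Table VII and the normalized significances of Table VIII to within rounding.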
If there are significant discrepancies, the procedure should be reevaluated to determine the possible causes, which may point to new or different means of gathering data. Several avenues can be pursued from this point: the judges can be asked to rank the possible parameters once again; a new group of judges can be consulted to do the ranking; or, at worst, the entire analysis can be discarded, leaving the decision maker no worse off than he was when operating under complete uncertainty. This assumes that the time and money invested are not significant. If, on the other hand, there are no significant discrepancies, a tool exists for the assignment of significance to the parameters.

HYPOTHETICAL EXAMPLE

As a simple hypothetical example, consider an airplane which the Government desires to have produced. It has been decided that an incentive-type contract will be negotiated with the supplier. However, exactly which parameters will have incentives placed upon them, and the relative importance of each parameter, have not yet been determined. It is necessary to determine the significance of each parameter. Ten individuals are available who qualify as experts on the basis of their past experience. The parameters are as follows:

S I    Airspeed.
C III  Combat Ceiling.
R II   Combat Range.
P IV   Payload carrying capability.

The ten people were asked to rank the four parameters subjectively in order of significance. The results are shown in Table IV, where a rank of 1 is best.

Table IV. Rank Orderings Submitted by the Judges

Judge           S I   R II   C III   P IV
1                3     2      1       4
2                1     2      3       4
3                3     1      2       4
4                1     2      3       4
5                3     1      2       4
6                3     1      2       4
7                3     2      4       1
8                3     4      1       2
9                2     4      1       3
10               2     1      3       4
Sum of ranks    24    20     22      34
Average rank   2.4   2.0    2.2     3.4

Matrix A of the computational procedure for this example is shown in Table V.

Table V.
Matrix A: Number of Times Parameter i Judged More Significant Than Parameter j

Parameter i   S I   R II   C III   P IV
S I            X     4      4       8
R II           6     X      7       7
C III          6     3      X       9
P IV           2     3      1       X

For example, Judge No. 7 ranked the parameters as to their significance in the following order: P IV, R II, S I, C III.

Matrix P and Matrix Z for this example are given in Tables VI and VII.

Table VI. Matrix P: Proportion of Times Parameter i Judged More Significant Than Parameter j

Parameter i   S I     R II    C III   P IV    Row total
S I            X      0.400   0.400   0.800   1.600
R II          0.600    X      0.700   0.700   2.000
C III         0.600   0.300    X      0.900   1.800
P IV          0.200   0.300   0.100    X      0.600

Table VII. Matrix Z: Basic Transformation Matrix

Parameter i   S I        R II       C III      P IV       Total       Mean (Z̄)
S I           0          -0.25334   -0.25334    0.84161    0.33493    0.08373
R II          0.25334     0          0.52441    0.52441    1.30216    0.32554
C III         0.25334    -0.52441    0          1.28155    1.01048    0.25262
P IV         -0.84161    -0.52441   -1.28155    0         -2.64757   -0.66189

The necessary values to compute the relative significance are shown in Table VIII.

Table VIII. Calculation of Relative Significance

Parameter i   Z̄          G(Z̄)      Normalized relative significance
S I           0.08373    0.53356    0.2647
R II          0.32554    0.62761    0.3115
C III         0.25262    0.59972    0.2977
P IV         -0.66189    0.25403    0.1261

In order to check the internal consistency of the judges, it is necessary to begin with the final scale values and reverse the computational procedure. The difference (Z̄_i - Z̄_j) is transformed, using the normal tables, into G(Z̄_i - Z̄_j), the percentage of times parameter i theoretically should have been judged more significant than parameter j. Table IX.
Consistency Check

Z̄_i - Z̄_j                                       P (computed)
Z̄_I - Z̄_II   = 0.08373 - 0.32554    = -0.24181   0.40447
Z̄_I - Z̄_III  = 0.08373 - 0.25262    = -0.16889   0.43295
Z̄_I - Z̄_IV   = 0.08373 - (-0.66189) =  0.74562   0.77205
Z̄_II - Z̄_III = 0.32554 - 0.25262    =  0.07292   0.52906
Z̄_II - Z̄_IV  = 0.32554 - (-0.66189) =  0.98743   0.83828
Z̄_III - Z̄_IV = 0.25262 - (-0.66189) =  0.91451   0.81977

The difference between the computed percentage of times parameter i should have been judged more significant than parameter j and the observed percentage of times parameter i was judged more significant than parameter j is designated Δ_ij. Each Δ_ij is compared to the mean of the absolute values of all of the differences, which is taken as the deviation about the expected p_ij.

Table X. Deviation Calculations

Δ(I-II)   = 0.40447 - 0.400 =  0.00447
Δ(I-III)  = 0.43295 - 0.400 =  0.03295
Δ(I-IV)   = 0.77205 - 0.800 = -0.02795
Δ(II-III) = 0.52906 - 0.700 = -0.17094
Δ(II-IV)  = 0.83828 - 0.700 =  0.13828
Δ(III-IV) = 0.81977 - 0.900 = -0.08023

Σ|Δ_ij| = 0.45482
Average deviation = Σ|Δ_ij| / N = 0.45482 / 6 = 0.07580

The largest absolute discrepancy between observed and calculated values is 0.17094. This discrepancy is less than three average deviations, which indicates that the data are reliable; that is, the judges were consistent among themselves and the assignment of the values was not unjustified.

SUMMARY

The suggested technique presents the decision maker with yet another tool in his quest for assistance. A decision maker with an important decision to make under conditions of uncertainty would surely welcome any assistance. It has been demonstrated that the method of rank order can be used to change a decision under uncertainty to a decision under risk; that is, from a decision with no probabilities known to one in which some measure of the probabilities involved has been determined. Morris [3] gives a very good coverage of the principles of choice which may be applied to such decisions.
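The backward consistency check above can be sketched as follows, again with statistics.NormalDist in place of the printed normal tables; the two dictionaries simply restate Tables VIII and VI, and the variable names are mine.

```python
from statistics import NormalDist

nd = NormalDist()
# Scale values of Table VIII and observed proportions of Table VI
# (parameter order: S I, R II, C III, P IV).
Z = {"S": 0.08373, "R": 0.32554, "C": 0.25262, "P": -0.66189}
observed = {("S", "R"): 0.400, ("S", "C"): 0.400, ("S", "P"): 0.800,
            ("R", "C"): 0.700, ("R", "P"): 0.700, ("C", "P"): 0.900}

# Work backward: the proportion the scale values theoretically demand,
# then the discrepancy Delta against what the judges actually produced.
deltas = {}
for (i, j), p_obs in observed.items():
    p_theory = nd.cdf(Z[i] - Z[j])      # "P (computed)" column of Table IX
    deltas[(i, j)] = p_theory - p_obs   # entries of Table X

avg_dev = sum(abs(d) for d in deltas.values()) / len(deltas)
```

The largest |Δ| comes out near 0.171 against an average deviation near 0.076, matching the three-average-deviations acceptance argument of the text.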
In cases which qualify as decisions under uncertainty, this method can be applied. If better information is available it should be used, but in any case all the information should be considered. The use of subjective estimates such as the above is discussed extensively in Edwards [1] and in Fishburn [2].

There are several facets of this technique that deserve further consideration. Perhaps the distribution assumed should be one other than the normal; perhaps a Beta distribution would be more appropriate. This would have to be examined closely by the analyst. In the development of PERT, a weighting factor obtained from the Beta distribution was applied to the estimates because the judges consistently estimated too low. A factor of this type might be applicable here. A record should be maintained, both individually and collectively, of the judges' predictions and the subsequent results; a weighting factor based on past data could then be introduced into future judgments. Undoubtedly, there are many possible modifications to the technique presented here. Further study may prove very rewarding, since the use of subjective judgments in aiding decision making is a largely untapped, though potentially rewarding, field.

REFERENCES

[1] Edwards, W., "The Theory of Decision Making," Psychological Bulletin 51, 380-417 (1954).
[2] Fishburn, P. C., Decision and Value Theory (John Wiley and Sons, Inc., New York, 1964).
[3] Morris, W. T., Engineering Economy (Richard D. Irwin, Inc., Homewood, Illinois, 1960).
[4] Nightengale, M. E., "An Approach to Decisions Under Uncertainty," Arizona State University Industrial Engineering Research Bulletin No. 1 (1965).
[5] Nightengale, M. E., "A Systems Approach to Strategy Formulation When Decisions Must Be Made Under Conditions of Uncertainty," Ph.D. Dissertation, Arizona State University, 1966.
[6] Thurstone, L.
L., The Measurement of Values (University of Chicago Press, Chicago, 1959).

A QUEUEING PROCESS WITH VARYING DEGREE OF SERVICE

Meckinley Scott
Mathematics Department
University of Alabama

ABSTRACT

This article describes and presents some results of the analysis of a queueing process in which the degree of service given to a customer is subject to control by the management. The control doctrine used is based on the queue length and also on the recent history of the system.

In some queueing situations the total service given to a customer may be considered as consisting of two or more stages. In fact, in any queueing process for which we can assume that the service time has an Erlang distribution of order k, the servicing operation may be regarded as consisting of k stages (regardless of whether or not these stages have a physical meaning). When stage 1 of service is complete the customer passes immediately to stage 2, and so on. There are situations in which it is necessary to distinguish between the various stages of service, for example, (i) those which are essential and (ii) those which are not. In such a case, from the management point of view, it may be desirable to dispense with the nonessential stages of service whenever there is a large accumulation of customers in the queue.* Thus, the degree (extent) of service given to a customer is no longer fixed, but a variable subject to control by the management. Since these types of problems have not been discussed elsewhere in the literature, the purpose of this note is to introduce and present some results of the analysis of a simple queueing model of the above character.

DESCRIPTION OF THE MODEL

Consider a queueing process in which the total service is completed in two stages, one followed immediately by the other. Starting from an instant at which the system is empty, the queue length increases to a certain prescribed number R for the first time.
At this point, the mechanism for the second stage of service is shut off until the queue length drops to a value r (0 ≤ r ≤ R - 1), at which time both stages of service again become available. When the queue length increases again to R (irrespective of whether or not, in the meantime, the system became empty) the same process is repeated. This policy does not, however, affect a customer whose second stage of service has already begun. That is, unavailability of the second stage is to be understood in the sense that the service of a customer already in the second stage remains uninterrupted until completed, regardless of the number of arrivals during his service time. Furthermore, it is assumed that unfortunate customers who receive only the first stage of service do not return later for the second stage.

The above type of control on the service mechanism gives rise to the phenomenon of hysteresis: the availability of the total service is no longer uniquely determined by the realized queue length. Rather, it depends on the previous history of the system, that is, on the path by which it arrived at the present state.

The waiting room is assumed to have a capacity N (N ≥ R), and the queue discipline is first-come, first-served.† Customers arrive at the system according to the Poisson law with intensity λ. The lengths of service for the first and second stages are independent random variables, each following an exponential distribution with mean durations 1/μ₁ and 1/μ₂, respectively, and are independent of the arrival process.

*In this study the phrase "customers in the queue" is used in the sense that the one being served is also included.

CHARACTERIZATION OF THE STATES IN THE QUEUE

Associated with the service availability at time t is an indicator variable Z_t, which assumes, say, a value 1 when the second stage of service is available and a value 0 when it is not.
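The hysteresis in the indicator Z_t can be made concrete with a small sketch (the function name and parameter values are illustrative, not the author's; the update rule is the control doctrine just described):

```python
def second_stage_available(z_prev, queue_len, R, r):
    """Update the availability indicator Z_t after the queue length changes.

    Once the queue reaches R the second stage is withheld (Z = 0) until the
    length has dropped back to r; for r < queue_len < R the indicator keeps
    whatever value history gave it -- this is the hysteresis.
    """
    if queue_len >= R:
        return 0
    if queue_len <= r:
        return 1
    return z_prev
```

For queue lengths strictly between r and R the returned value depends on z_prev, i.e., on the path by which the system arrived at its present state.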
Let X_t denote the queue size at time t; from the description of the model it is clear that if X_t ≤ r then Z_t = 1 and if X_t ≥ R then Z_t = 0, but if r < X_t < R then Z_t may be either 1 or 0. Also, when at time t there is at least one customer in the queue, let Y_t be another indicator variable which assumes, say, a value 1 when the customer being served is in the first stage of service and a value 2 when he is in the second stage.

Let E₀ and E₁ denote, respectively, the sets of states in which there is no customer and at least one customer in the queue. A typical state of the set E₁ may be characterized by a triplet (n, i, j), where n refers to the queue size and i and j to the values of the indicator variables Y_t and Z_t, respectively. Of interest are four mutually exclusive and exhaustive subsets of E₁ given by

A_i: {(n, i, 1); 1 ≤ n ≤ R - 1}, (i = 1, 2),
B₁: {(n, 1, 0); r + 1 ≤ n ≤ N},
B₂: {(n, 2, 0); R ≤ n ≤ N}.

Let A = A₁ ∪ A₂ and B = B₁ ∪ B₂. Then A is the set of states in which the second stage of service is available and B is the set of states in which it is not. As explained earlier, unavailability of the second stage of service does not affect a customer whose second stage has already begun; the states of the set B₂ describe this situation. The flow diagram, Figure 1, illustrates the way in which the queue operates.

STEADY STATE PROBABILITIES AND GENERATING FUNCTIONS

Let p_t(0) denote the probability that at time t there is no customer in the queue. Also, let p_t(n, i) and q_t(n, i), (i = 1, 2), denote the probabilities that at time t there are n customers in the queue and that the system is in set A_i and set B_i, respectively. That is,

p_t(n, i) = Pr [X_t = n, Y_t = i, Z_t = 1], (i = 1, 2; n = 1, 2, ..., R - 1),
q_t(n, 1) = Pr [X_t = n, Y_t = 1, Z_t = 0], (n = r + 1, r + 2, ..., N),
q_t(n, 2) = Pr [X_t = n, Y_t = 2, Z_t = 0], (n = R, R + 1, ..., N).
The steady-state probabilities corresponding to p_t(0), p_t(n, i), and q_t(n, i) will be denoted by p(0), p(n, i), and q(n, i), respectively.

†The mathematical development of the process in this study does not specifically require the assumption of first-come, first-served queue discipline.

Figure 1. Flow diagram showing queue operation (in the triplet (n, i, j), n refers to the queue size and i and j to the values of the indicator variables Y_t and Z_t, respectively)

The steady-state equations of the system can easily be derived in the usual manner and, therefore, will not be presented here. It can be shown that the equations pertaining to sets B₁ and B₂ yield the following solutions:

(1) q(n, 1) = [(1 - ρ₁^{n-r}) / (1 - ρ₁)] q(r+1, 1), (n = r+1, r+2, ..., R-1),

(2) q(R, 1) = [(1 - ρ₁^{R-r}) / (1 - ρ₁)] q(r+1, 1) - ρ₁(1 - a) p(R-1, 2),

(3) q(n, 1) = ρ₁^{n-R} [(1 - ρ₁^{R-r}) / (1 - ρ₁)] q(r+1, 1) + [ρ₁ / (a - ρ₁)] [a^{n-R+1} - ρ₁^{n-R}(a - ρ₁ + aρ₁)] p(R-1, 2), (n = R+1, R+2, ..., N-1),

(4) q(N, 1) = ρ₁^{N-R} [(1 - ρ₁^{R-r}) / (1 - ρ₁)] q(r+1, 1) + [ρ₁ / (a - ρ₁)] [a^{N-R+1} - ρ₁^{N-R-1}(a - ρ₁ + aρ₁)] p(R-1, 2),

(5) q(n, 2) = a^{n-R+1} p(R-1, 2), (n = R, R+1, ..., N-1),

(6) q(N, 2) = ρ₂ a^{N-R} p(R-1, 2),

where

(7) ρ₁ = λ/μ₁,

(8) ρ₂ = λ/μ₂,

(9) a = λ/(λ + μ₂).

Throughout the rest of this paper it will be assumed that ρ₁ ≠ 1 and a ≠ ρ₁. The cases ρ₁ = 1 and a = ρ₁ can be obtained as limiting cases or by solving the equations directly. Now introduce the following generating functions:

(10) P(u, i) = Σ_{n=1}^{R-1} u^n p(n, i), (i = 1, 2).
Explicit solutions of the equations pertaining to sets A₁ and A₂ appear very difficult to obtain, but it can be shown that

(11) P(u, i) = Δ(u, i) / Δ(u), (i = 1, 2),

where

(12) Δ(u, 1) = (λu - λ - μ₂)[λu(1 - u)p(0) + λu^{R+1} p(R-1, 1) - μ₁u^{r+1} q(r+1, 1)] - λμ₂u^R p(R-1, 2),

(13) Δ(u, 2) = λ(λu - λ - μ₁)u^{R+1} p(R-1, 2) - μ₁[λu(1 - u)p(0) + λu^{R+1} p(R-1, 1) - μ₁u^{r+1} q(r+1, 1)],

and

(14) Δ(u) = λ²(u - 1)(u - u₁)(u - u₂),

where

(15) u_i = [(λ + μ₁ + μ₂) ∓ √((λ + μ₁ + μ₂)² - 4μ₁μ₂)] / 2λ, (i = 1, 2),

with u₁ taking the negative sign in front of the radical. The probabilities of the sets A₁ and A₂ can be obtained from Eqs. (11) through (15) and are given by

(16) P(A₁) = [1/(1 - ρ)] [ρ₁p(0) + ρ₁(1 - ρ₂)p(R-1, 2) - (R - r)q(r+1, 1)]

and

(17) P(A₂) = [1/(1 - ρ)] [ρ₂p(0) + ρ₁ρ₂p(R-1, 2) - (R - r)(ρ₂/ρ₁)q(r+1, 1)],

respectively, where ρ = ρ₁ + ρ₂ ≠ 1. The probabilities of the sets B₁ and B₂ can be obtained from Eqs. (1) through (9); summing Eqs. (1) through (4) over n gives

(19) P(B₁) = Σ_{n=r+1}^{N} q(n, 1),

and, correspondingly,

(20) P(B₂) = ρ₂ p(R-1, 2).

It can also be shown that the expected queue length L is given by

(21) L = L(A) + L(B),

where L(A) = Σ_{n=1}^{R-1} n[p(n, 1) + p(n, 2)] is the contribution to L from the set A = A₁ ∪ A₂, and L(B) = Σ_n n[q(n, 1) + q(n, 2)] is the corresponding contribution from the set B; closed forms for these sums (Eqs. 22 and 23) follow by carrying out the summations with Eqs. (1) through (9). Similar remarks apply to L(B).
So far, all quantities have been expressed in terms of p(0), p(R-1, 1), p(R-1, 2), and q(r+1, 1); these can be determined by solving the following set of linear equations:

(24) λp(R-1, 1) + λp(R-1, 2) - μ₁q(r+1, 1) = 0,

(25) (λu_i - λ - μ₂)[λu_i(1 - u_i)p(0) + λu_i^{R+1} p(R-1, 1) - μ₁u_i^{r+1} q(r+1, 1)] - λμ₂u_i^R p(R-1, 2) = 0, (i = 1, 2),

(26) p(0) + P(A₁) + P(A₂) + P(B₁) + P(B₂) = 1, with the set probabilities expressed in the four unknowns through Eqs. (1) through (20).

Equations (24) and (25) are derived from the condition that in P(u, 1) the zeros of the numerator coincide with the zeros of the denominator,** and Eq. (26) is obtained from the normalizing condition for the sum of all probabilities.

**An equivalent set of equations is obtained by considering P(u, 2). Equation (24) may also be obtained from the equations of the set B.

Tables 1 through 4 give some numerical computations of p(0), P(A_i), P(B_i), (i = 1, 2), and L.

Table 1. N = 10, R = 6, λ = 1, μ₁ = 2, μ₂ = 3

r    p(0)     P(A₁)    P(A₂)    P(B₁)    P(B₂)    L
0    0.2469   0.3806   0.2524   0.1188   0.0013   2.1038
1    0.2361   0.3968   0.2631   0.1026   0.0014   2.1510
2    0.2265   0.4113   0.2727   0.0881   0.0015   2.2282
3    0.2178   0.4244   0.2812   0.0749   0.0017   2.3339
4    0.2095   0.4368   0.2891   0.0624   0.0021   2.4736
5    0.2012   0.4494   0.2967   0.0497   0.0029   2.6667

Table 2. N = 10, R = 7, λ = 1, μ₁ = 2, μ₂ = 3

r    p(0)     P(A₁)    P(A₂)    P(B₁)    P(B₂)    L
0    0.2321   0.4031   0.2678   0.0961   0.0009   2.2551
1    0.2243   0.4149   0.2756   0.0843   0.0010   2.2920
2    0.2172   0.4256   0.2827   0.0735   0.0010   2.3516
3    0.2107   0.4354   0.2891   0.0637   0.0011   2.4315
4    0.2047   0.4445   0.2950   0.0545   0.0013   2.5320
5    0.1990   0.4533   0.3006   0.0456   0.0016   2.6585
6    0.1930   0.4624   0.3060   0.0364   0.0022   2.8285

Table 3.
N = 12, R = 6, λ = 1, μ₁ = 0.95, μ₂ = 10

r    p(0)     P(A₁)    P(A₂)    P(B₁)    P(B₂)    L
0    0.0450   0.1308   0.0124   0.8118   0.0000   6.8329
1    0.0425   0.1619   0.0154   0.7802   0.0000   6.8613
2    0.0403   0.1923   0.0182   0.7492   0.0000   6.9021
3    0.0384   0.2221   0.0211   0.7184   0.0000   6.9533
4    0.0367   0.2517   0.0239   0.6877   0.0000   7.0153
5    0.0353   0.2810   0.0266   0.6570   0.0001   7.0921

Table 4. N = 12, R = 7, λ = 1, μ₁ = 0.95, μ₂ = 10

r    p(0)     P(A₁)    P(A₂)    P(B₁)    P(B₂)    L
0    0.0436   0.1552   0.0147   0.7865   0.0000   6.8743
1    0.0411   0.1870   0.0178   0.7541   0.0000   6.9074
2    0.0389   0.2182   0.0207   0.7221   0.0000   6.9521
3    0.0370   0.2491   0.0236   0.6902   0.0000   7.0069
4    0.0354   0.2799   0.0266   0.6582   0.0000   7.0701
5    0.0339   0.3107   0.0295   0.6258   0.0000   7.1421
6    0.0327   0.3418   0.0324   0.5930   0.0001   7.2271

AVAILABILITY OF SECOND STAGE OF SERVICE

For the queueing process under consideration, the most interesting characteristics are clearly those concerned with the availability or unavailability of the second stage of service. The probability of the set B, given by

(27) P(B) = P(B₁) + P(B₂),

may be interpreted as the fraction of time the second stage of service is unavailable, whereas

(28) 1 - P(B) = p(0) + P(A) = p(0) + P(A₁) + P(A₂)

is the fraction of time it is available. In some instances it may also be important to know how often the mechanism for the second stage of service is switched off or on. Let f denote the expected frequency per unit time with which the mechanism is switched off. Under steady-state conditions, f also equals the expected frequency per unit time with which the mechanism is switched on; in fact, Eq. (24) reflects this property of the system. We have then

(29) f = μ₁q(r+1, 1) = λ{p(R-1, 1) + p(R-1, 2)}.

A FIRST-PASSAGE TIME PROBLEM

This section deals with the distribution of the time that elapses before the system enters the set A, starting initially from one of the states in the set B. Let f_{n,i}(t), (i = 1, 2), denote the probability density function (p.d.f.)
of the required distribution, given that initially the system was in a state (n, i, 0) of the set B. A little reflection shows that f_{R,2}(t) is identical with the p.d.f. of the time that elapses from the instant the system enters the set B from the set A until it reenters the set A (that is, from the instant the service mechanism for the second stage is switched off until it is switched on again), given that entry into the set B is made through the state (R, 2, 0). Similar remarks apply to the p.d.f. f_{R,1}(t). A little further reflection shows that f_{n,1}(t), (n ≥ r + 1), is the same as the distribution of the time, in an M|M|1 queue with capacity N, that elapses before the queue size drops to a value r, starting initially from a value n. If we write f*_{n,i}(s) for the Laplace transform of the p.d.f. f_{n,i}(t), then, using the same kind of argument as in [4], it can be shown that

(30) f*_{n,1}(s) = k₁α₁^n + k₂α₂^n, (n = r+1, r+2, ..., N),

(31) f*_{n,2}(s) = b₁[k₁α₁^{n-1}/(1 - b₂α₁) + k₂α₂^{n-1}/(1 - b₂α₂)] + b₃b₂^{N-n}[k₁α₁^{N-1} + k₂α₂^{N-1}], (n = R, R+1, ..., N),

where

(32) α₁ = [(λ + μ₁ + s) + √((λ + μ₁ + s)² - 4λμ₁)] / 2λ,

(33) α₂ = [(λ + μ₁ + s) - √((λ + μ₁ + s)² - 4λμ₁)] / 2λ,

(34) b₁ = μ₂/(λ + μ₂ + s),

(35) b₂ = λ/(λ + μ₂ + s),

(36) b₃ = μ₂/(μ₂ + s),

(37) k₁ = c₁α₁^{N-r-2}(α₁ - c₃) / (α₂^{r+1}[α₁^{N-r-2}(1 - c₂α₂)(α₁ - c₃) + α₂^{N-r-2}(1 - c₂α₁)(c₃ - α₂)]),

(38) k₂ = c₁α₂^{N-r-2}(c₃ - α₂) / (α₁^{r+1}[α₁^{N-r-2}(1 - c₂α₂)(α₁ - c₃) + α₂^{N-r-2}(1 - c₂α₁)(c₃ - α₂)]),

(39) c₁ = μ₁/(λ + μ₁ + s),

(40) c₂ = λ/(λ + μ₁ + s),

(41) c₃ = μ₁/(μ₁ + s).

The expected value E{T_{n,i}} corresponding to the p.d.f. f_{n,i}(t) is given by

(42) E{T_{n,1}} = (n - r)/(μ₁ - λ) - [λ/(μ₁ - λ)²] {(λ/μ₁)^{N-n} - (λ/μ₁)^{N-r}}, (n = r, r+1, ..., N),

and

(43) E{T_{n,2}} = E{T_{n-1,1}} + correction terms, involving (λ/μ₁)^{N-n+1}, μ₁(λ + μ₂ - μ₁), and (λ/(λ + μ₂))^{N-n+1}, which account for the stage-2 service in progress, (n = R, R+1, ..., N).
Notice that when λ and μ₁ are fixed and μ₂ → ∞, then E{T_{n,2}} → E{T_{n-1,1}}, a result which is intuitively obvious. Furthermore, when μ₁ = μ₂, then E{T_{n,2}} = E{T_{n,1}}, (n = R, R+1, ..., N).

DISCUSSION

In this paper we have presented expressions for several important characteristics of a queueing process in which the degree of service given to a customer is subject to control by the decision maker. A knowledge of these characteristics enables us, in principle, to construct a reasonable objective function based on costs originating from three sources: (a) a cost proportional to the queueing times of customers, (b) costs associated with the two stages of service, and (c) costs associated with switching the mechanism for the second stage of service. In any given situation where the parameters of the service-time and arrival-time distributions are fixed and the objective function is defined, the value of the objective function may be computed for any given set of the control variables r, R, and N. Needless to say, the problem of deriving optimal sets of the control variables is one of considerable complexity, and we will not pursue this subject further.

In a certain sense, the model discussed in this paper may also be considered a queueing process involving a variable number of servers, when it is assumed that the two stages of service are operated by two different servers. Queueing processes involving a variable number of servers have received some attention in the literature, see for example [2]; however, in those models every customer is given the same degree of service and two or more customers may be served simultaneously, whereas this is not so in the present case.

REFERENCES

[1] Cox, D. R. and W. L. Smith, Queues (Methuen and Wiley, 1961).
[2] Moder, J. J. and C. R. Phillips, Jr., "Queueing with Fixed and Variable Channels," Operations Research, Vol. 10, No. 2 (Mar.-Apr. 1962).
[3] Morse, P.
M., Queues, Inventories and Maintenance (John Wiley and Sons, New York, 1958).
[4] Naor, P., "On a First-Passage Time Problem in Chemical Engineering," University of North Carolina, Institute of Statistics, Mimeo Series No. 400 (Aug. 1964).
[5] Saaty, T. L., Elements of Queueing Theory (McGraw-Hill, New York, 1961).
[6] Yadin, M. and P. Naor, "On Queueing Systems with Variable Service Capacities," Nav. Res. Log. Quart. 14, 43-53 (1967).

MAD: MATHEMATICAL ANALYSIS OF DOWNTIME*

Edward B. Brandt
The RAND Corporation
Washington, D.C.

and

Dilip R. Limaye
Decision Sciences Corporation
Jenkintown, Pa.

ABSTRACT

The MAD model presents a mathematical treatment of the relationship between aircraft reliability and maintainability, system manning and inspection policies, scheduling and sortie length, and aircraft downtime. Lognormal distributions are postulated for subsystem repair times, and simultaneous repair of malfunctions is assumed. The aircraft downtime for maintenance is computed from the distribution of the largest of k lognormal variables. Waiting time for maintenance men is calculated either by using a multiple-channel queueing model or by generating the distribution of the number of maintenance men required and comparing this to the number of men available to determine the probability of waiting at each inspection.

INTRODUCTION

The maintenance characteristics of alternative aircraft designs should be properly represented in cost-effectiveness trade studies through their impact on both cost and effectiveness. The cost impact of greater maintenance requirements is recognized through higher manning tables, increased costs of ground support equipment, etc. However, maintenance also influences measures of effectiveness. The traditional approach in aircraft reliability analyses relates malfunctions to flight hours.
Because downtime is influenced partly by the number of malfunctions, and the number of malfunctions depends on utilization, availability decreases with higher utilization. Therefore, downtime determines either the utilization consistent with a specified availability or the availability consistent with a specified utilization. Both of these measures are important parameters in analyses of effectiveness.

The MAD model relates the downtime of an aircraft to the component or subsystem repair times, maintenance occurrences, inspection philosophy, flight scheduling, sortie length, and the number of men assigned to the maintenance facility. The model can be used to:

• Determine the downtime implications of changes in reliability (R) and maintainability (M).
• Evaluate the impact of additional maintenance men on aircraft waiting time and determine the optimum number of maintenance men.
• Determine the effect of alternative inspection philosophies on downtime. In particular, the model is sensitive to the flight hours between inspections and the amount of deferred maintenance for each type of inspection.
• Evaluate the surge capability of the system; i.e., the ability to fly more than usual for a given period of time.

*Research on this paper was performed by the authors at the Vertol Division, The Boeing Company, Philadelphia, Pa.

GENERAL DESCRIPTION OF MODEL

The model comprises several subroutines. The overall flow of the model is shown in Figure 1: input data (elapsed maintenance times per action, maintenance action rates, flight hours between inspections, probabilities of detecting a failure at inspection, and the numbers of maintenance men and operational aircraft) are combined into the per-flight-hour downtime components, namely the look phase of each type of inspection (D_l), maintenance actions (D_m), waiting due to an insufficient number of maintenance men (D_w), not-operationally-ready-spares time (D_n), and administrative delays (D_a); these sum to the downtime per flight hour, D = D_m + D_l + D_w + D_n + D_a, from which utilization and availability, A = f(U, D, m), are obtained.
Figure 1. MAD model flow chart

SPECIFIC SUBROUTINES

Probability Distribution of Maintenance Actions

The model is formulated under the assumption that most failures are repaired at inspections. In other words, the aircraft is not considered to abort its mission and immediately return to base for every failure. Failures causing a mission abort are treated separately and assumed to be repaired individually. All other maintenance actions are located and repaired at scheduled intervals such as preflight, postflight, daily, and periodic inspections; multiple actions can be repaired simultaneously when the required maintenance men are available.

The Poisson distribution describes the number of failures occurring at each inspection. The fundamental assumption underlying the use of a Poisson distribution is that individual failures are independent. Given independence, it can be proved that, for a complex system of many components, the probability distribution of the time between actions is exponential, regardless of the individual failure law for each component [1]. The distribution of the number of actions in a specified time period is Poisson when the time between actions is exponential. Therefore, the probability of x failures occurring at the ith inspection is given by

P_i(x) = (λt_i)^x e^{-λt_i} / x!,

where λ = failure rate (occurrences per flight hour), and t_i = flight hours between inspections (i - 1) and i. It is not assumed that all inspections are so thorough that every failure will be detected and then repaired.
For example, a certain inspection may not check for transmission leaks or hydraulic pressure. Additionally, some failures are not flight-critical and are deferred to later inspections. All failures not repaired at the inspection directly after their occurrence are considered undetected or deferred. These failures are noted and examined at succeeding inspections until there is a major inspection in the inspection cycle, at which time all previously undetected actions are detected and repaired. The probability of repair at the ith inspection (P_i) can be derived either empirically or on the basis of the inspected components and their relative occurrence. The probability distribution function of repairing k failures at the ith inspection, given that x have occurred, is binomial:

(1) $P_i(k|x) = \binom{x}{k} P_i^k (1 - P_i)^{x-k}$.

Because the distribution of failures occurring is Poisson, the marginal distribution of the number of actions repaired at the ith inspection is

(2) $P_i(k) = \sum_{x=k}^{\infty} P_i(k|x)\,H_i(x) = \frac{(P_i \lambda t_i)^k e^{-P_i \lambda t_i}}{k!}$.

The distribution of the number of maintenance actions repaired at the ith inspection is therefore Poisson with mean P_i λt_i. The probability distribution of the number of failures that are not repaired is Poisson with mean (1 − P_i)λt_i. In the notation defined above, the unrepaired failures for the first inspection (U_1) are

(3) $U_1 = \left(\frac{1 - P_1}{P_1}\right) \bar m_1$,

where m̄_1 = expected number of actions repaired at inspection 1 = P_1 λt_1.

The available actions at the second inspection include the unrepaired failures from the first inspection plus any failures occurring between the first and second. The probability of repair at the second inspection is applied to all available actions. Therefore,

(4a) $\bar m_2 = P_2 \left[\left(\frac{1 - P_1}{P_1}\right) \bar m_1 + \lambda t_2\right]$ and $U_2 = \left(\frac{1 - P_2}{P_2}\right) \bar m_2$.

In general,

(4b) $\bar m_n = P_n \left[\left(\frac{1 - P_{n-1}}{P_{n-1}}\right) \bar m_{n-1} + \lambda t_n\right]$ and $U_n = \left(\frac{1 - P_n}{P_n}\right) \bar m_n$.
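The thinning property behind Eq. (2) — that independently repairing each of a Poisson number of failures with probability P_i leaves a Poisson count with mean P_i λt_i — can be checked numerically. The following is a minimal sketch, not part of the original model; the function name and parameter values are illustrative:

```python
import math
import random

def simulate_repaired_count(lam_t=4.0, p_repair=0.6, trials=200_000, seed=1):
    """Draw Poisson(lam_t) failures per inspection, repair each independently
    with probability p_repair, and return the average number repaired."""
    rng = random.Random(seed)

    def poisson(mean):
        # Knuth's multiplication method; adequate for small means.
        limit = math.exp(-mean)
        k, prod = 0, rng.random()
        while prod > limit:
            k += 1
            prod *= rng.random()
        return k

    total = 0
    for _ in range(trials):
        failures = poisson(lam_t)
        total += sum(1 for _ in range(failures) if rng.random() < p_repair)
    return total / trials
```

With λt_i = 4.0 and P_i = 0.6 the simulated mean comes out near P_i λt_i = 2.4, as Eq. (2) predicts.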
Consider a constant injection of new actions occurring at a rate λ per inspection and a constant probability of repair P at each inspection. The expected number of unrepaired failures at the nth inspection, U_n, is given by

(4c) $U_n = \sum_{i=1}^{n} (1 - P)^i \lambda$.

As n becomes large,

(4d) $\lim_{n \to \infty} U_n = \left(\frac{1 - P}{P}\right) \lambda, \qquad 0 < P \le 1$,

and the expected number of repaired actions is

(4e) $\lim_{n \to \infty} \bar m_n = P\left[\left(\frac{1 - P}{P}\right)\lambda + \lambda\right] = \lambda$.

In cases where there is a major inspection (n + 1) in the inspection cycle at which all previously undetected actions are detected and repaired, we have P_{n+1} = 1 and therefore

(4f) $\bar m_{n+1} = \left(\frac{1 - P_n}{P_n}\right) \bar m_n + \lambda t_{n+1}$ and $U_{n+1} = \left(\frac{1 - P_{n+1}}{P_{n+1}}\right) \bar m_{n+1} = 0$.

The succeeding (n + 1) inspections will then follow the same pattern as inspections 1 to (n + 1), and thus a steady-state cyclical pattern evolves. Equations (4a) through (4f) imply an assumption that an undetected failure is as likely to be detected at a succeeding inspection as a new failure. This assumption is not critical to the computation of downtime. When the total number of inspections in the inspection cycle is small and the probabilities P_i are large, the total aircraft downtime calculation is not significantly affected if the alternative assumption is made that a failure not detected at an inspection remains undetected until the major inspection. In the latter case, the equations for the undetected failures and repaired actions are

(4g) $\bar m_2 = P_2 \lambda t_2$,

(4h) $U_2 = \left(\frac{1 - P_2}{P_2}\right) \bar m_2 + \left(\frac{1 - P_1}{P_1}\right) \bar m_1$,

(4k) $\bar m_n = P_n \lambda t_n$,

(4p) $U_n = \sum_{i=1}^{n} \left(\frac{1 - P_i}{P_i}\right) \bar m_i$,

(4q) $\bar m_{n+1} = \lambda t_{n+1} + \sum_{i=1}^{n} \left(\frac{1 - P_i}{P_i}\right) \bar m_i$,

(4r) $U_{n+1} = 0$.

Distribution of the Longest Maintenance Action

The aircraft is out of commission for only the longest individual repair time when maintenance items are repaired simultaneously. It is assumed that the only limitation to simultaneous repair results from insufficient manpower; this is accounted for in maintenance waiting times.
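The recursion of Eqs. (4a)-(4b) and its steady state, Eqs. (4d)-(4e), can be traced with a few lines of code. This is a hedged sketch of the constant-λ, constant-P special case only; the function name and parameter values are illustrative:

```python
def inspection_series(lam=3.0, p=0.5, n_inspections=40):
    """Iterate Eqs. (4a)-(4b) with a constant injection of lam new actions per
    inspection (lam * t_n = lam) and a constant repair probability p.
    Returns a list of (repaired m_n, unrepaired U_n) pairs."""
    u = 0.0
    history = []
    for _ in range(n_inspections):
        m = p * (u + lam)            # repaired: p times the available actions
        u = (1.0 - p) / p * m        # left undetected or deferred, Eq. (4b)
        history.append((m, u))
    return history
```

With λ = 3 and P = 0.5 the series settles at U_n → ((1 − P)/P)λ = 3.0 (Eq. 4d) and m̄_n → λ = 3.0 (Eq. 4e).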
Downtime from active maintenance at each inspection (D_i) is given by

(5) $D_i = \sum_{k=1}^{\infty} E_L(k)\,P_i(k)$,

where E_L(k) = expected repair time of the longest of k simultaneous maintenance actions. The probability distribution of maintenance actions P_i(k) is given by Eq. (2). The expected value of the longest repair time is

(6) $E_L(k) = \int_0^{\infty} y\,\frac{d}{dy}[F(y)]^k\,dy = \int_0^{\infty} y\,k[F(y)]^{k-1} f(y)\,dy$,

where F(y) = the cumulative distribution function of the time to repair a single action. A numerical solution can be easily obtained if one assumes that each subsystem or component will always be repaired in the same time. However, this assumption is unrealistic, especially when higher levels of aggregation are necessary. The chosen method of solution uses a continuous probability density function to describe repair times. By this method, the variability within as well as between subsystems can be properly represented. The log normal distribution was selected to represent the times to repair for the following reasons:

1. The log normal is a flexible, two-parameter distribution which is skewed to the right. The probability of zero repair time is always zero.
2. The log normal distribution has been shown to fit well to reaction times of humans to more complicated perceptual patterns involving some degree of learning [2].
3. The actual distributions of repair times for individual systems and the entire aircraft compare favorably to the log normal for both helicopters and fixed-wing aircraft.

The log normal distribution is given by

(7) $f_Y(y) = \frac{1}{y\sigma\sqrt{2\pi}} \exp\left\{-\frac{1}{2\sigma^2} \ln^2\left(\frac{y}{y_m}\right)\right\}$,

where σ = standard deviation of ln y, and y_m = median of y. If y_1, y_2, ..., y_n are samples from a log normal distribution, the distribution of the largest, Y* = MAX(y_1, y_2, ..., y_n), is given by

(8) $f_{Y^*}(y) = n[F_Y(y)]^{n-1} f_Y(y)$,

where

(9) $F_Y(y') = \int_0^{y'} \frac{1}{y\sigma\sqrt{2\pi}} \exp\left\{-\frac{1}{2\sigma^2} \ln^2\left(\frac{y}{y_m}\right)\right\} dy$.

Substituting this result in (6), we get

(10) $E_L(k) = \int_0^{\infty} y\,f_{Y^*}(y)\,dy = \int_0^{\infty} y\,k[F_Y(y)]^{k-1} f_Y(y)\,dy$,

where f_Y(y) and F_Y(y) are given by (7) and (9), respectively. The integral in (10) is not easy to evaluate; however, Gumbel [3] has derived approximate results for different values of k. These results were used in the model.

Maintenance Waiting Times

This category accounts for the additional downtime resulting from the inability to achieve complete maintenance concurrency because of insufficient manpower. Waiting times are obviously influenced significantly by the distribution of failure occurrences. Poor planning of the assignment of aircraft or erratic demands for their use can lead to substantial waiting in one period and idle time in another. Two approaches have been programmed with the aim of obtaining a most probable value and a lower limit for maintenance waiting time. The approach which yields the most probable value assumes that aircraft arrive simultaneously at the maintenance facility with the requirement that the combined actions be processed at once. The second approach assumes perfect aircraft scheduling in which the expected number of actions arriving at the maintenance facility is constant per unit time period.

Bulk Arrival of Maintenance Actions. Each aircraft is assumed to go through the identical series of inspections throughout the entire inspection cycle. The inspection cycle may last for a number of periods, after which the aircraft is processed through a major inspection and is considered returned to a new condition. When many aircraft are assigned to the same facility, the combined maintenance actions must be processed together to avoid delays. Each aircraft is assigned a position in the inspection cycle.
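The longest-of-k expectation E_L(k) of Eq. (10) can also be approximated by straightforward Monte Carlo sampling instead of Gumbel's tabulated results. The sketch below is an illustration under assumed parameter values, not the method used in the model:

```python
import math
import random

def expected_longest(k, median=1.0, sigma=0.8, trials=100_000, seed=7):
    """Monte Carlo estimate of E_L(k), Eq. (10): the expected maximum of k
    independent log normal repair times. sigma is the standard deviation of
    ln(repair time) and median is y_m; both values here are illustrative."""
    rng = random.Random(seed)
    mu = math.log(median)
    total = 0.0
    for _ in range(trials):
        total += max(math.exp(rng.gauss(mu, sigma)) for _ in range(k))
    return total / trials
```

For k = 1 this is just the log normal mean, y_m exp(σ²/2) ≈ 1.38 with these values; E_L(k) grows with k, which is why many simultaneous actions lengthen the controlling repair time.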
The distribution for the combined actions is Poisson with mean

(11) $\bar m_T = \sum_{j=1}^{N_A} \bar m_{ij}$,

where m̄_ij = expected number of actions for aircraft j at inspection i (see Eqs. (4a) and (4b)), and N_A = total number of aircraft. It is now necessary to find the distribution of the number of maintenance men required, given that k actions are to be repaired simultaneously. Consider the aircraft as a collection of tasks, each requiring a particular number of men. It is possible to construct a distribution of the number of men per task. By the central limit theorem, the distribution of the total number of men required is approximately normal when the number of actions is large. The central limit theorem is applicable here because the region in which a manpower shortage exists occurs only at high values of k. Given k actions, the distribution of the total number of men required can be assumed normal with mean μ_{m,k} and standard deviation σ_{m,k}, where

$\mu_{m,k} = kE(A)$ and $\sigma_{m,k} = \sigma_A \sqrt{k}$,

where E(A) = expected number of men per task, and σ_A = standard deviation of the number of men per task. The aircraft must wait if the number of men required (M) exceeds the number available (M_0). Therefore, the probability of waiting, given that k actions have occurred, is

(12) $P_w(k) = P[M > M_0 | k] = \int_{M_0}^{\infty} f_M(m)\,dm = \frac{1}{\sigma_{m,k}\sqrt{2\pi}} \int_{M_0}^{\infty} \exp\left\{-\frac{1}{2}\left(\frac{m - \mu_{m,k}}{\sigma_{m,k}}\right)^2\right\} dm$.

The total probability of waiting is

(13) $P_w = \sum_k P_w(k)\,P(k)$.

This probability can be evaluated at each inspection. To determine the expected waiting time, it is assumed that the repair time for a task is independent of the manloading. Then the waiting time when the number of men required is m is given by

(14) $W(m) = \left(\frac{m - M_0}{M_0}\right) \bar T_R$,

where T̄_R = average elapsed repair time. The expected waiting time is then given by

(15) $E[W(m)] = \sum_k P(k) \int_{M_0}^{\infty} \left(\frac{m - M_0}{M_0}\right) \bar T_R\, f_M(m)\,dm$.

Continuous Arrivals of Maintenance Actions.
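Equations (12) and (13) reduce to a normal upper-tail probability weighted by the Poisson distribution of simultaneous actions. A minimal sketch using the standard-normal tail via the error function follows; all parameter values (men per task, its standard deviation) are assumptions for illustration:

```python
import math

def prob_wait_given_k(k, men_available, men_per_task=2.0, sd_per_task=1.0):
    """Eq. (12): P[M > M0 | k], with M approximately
    Normal(k * E(A), sigma_A * sqrt(k))."""
    mu = k * men_per_task
    sd = sd_per_task * math.sqrt(k)
    z = (men_available - mu) / sd
    return 0.5 * (1.0 - math.erf(z / math.sqrt(2.0)))   # upper normal tail

def total_prob_wait(mean_actions, men_available, k_max=80):
    """Eq. (13): weight P_w(k) by the Poisson probability of k actions."""
    p_wait = 0.0
    for k in range(1, k_max):
        p_k = math.exp(-mean_actions) * mean_actions ** k / math.factorial(k)
        p_wait += prob_wait_given_k(k, men_available) * p_k
    return p_wait
```

As expected, the probability of waiting rises as the number of available maintenance men falls, holding the mean number of actions fixed.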
This approach uses the standard multiple-channel waiting-time model. The required input information is the arrival rate, the service time, and the number of channels. For our problem, the maintenance actions represent the arrivals, the repair time is the service time, and the number of maintenance teams is the number of channels. The arrival rate is a function of the occurrences per flight hour, the flight hours per unit time period, the number of aircraft assigned to the maintenance facility, and the total maintenance working hours (clock time) per unit time period. The number of channels is equal to the total men assigned to the maintenance facility divided by the average men per action. The service time distribution is given by the distribution of elapsed repair times. The main advantages of the queuing theory approach are its simplicity and the availability of computational procedures to determine solutions. The main disadvantage is the assumptions that are forced upon the problem in order to facilitate the mathematical solution. These assumptions are as follows:

1. Arrival rate: The expected number of actions arriving at the maintenance facility is constant per unit time interval. This assumes nearly perfect scheduling, which ensures a constant demand rate for maintenance men. The probability distribution of the time between arrivals is exponential.
2. Service time: The expected service time for an action is equal to the average time to repair (ATTR). The probability distribution of repair times is assumed to be exponential.
3. Number of channels: The number of men per task is constant.

Under these assumptions, the standard multiple-channel queuing model is applicable and the expected value of waiting time can be easily determined. Let λ = arrival rate, μ = mean service rate, and s = number of channels; then the expected waiting time is given by Ref. [4]:

$E(\omega) = \frac{P_0 (\lambda/\mu)^s \rho}{\lambda\, s!\,(1 - \rho)^2}$,

where

$\rho = \text{density} = \frac{\lambda}{\mu s}$, and

$P_0$ = probability that zero teams are working $= \left[\sum_{n=0}^{s-1} \frac{(\lambda/\mu)^n}{n!} + \frac{(\lambda/\mu)^s}{s!\,(1 - \rho)}\right]^{-1}$.

MATHEMATICAL ANALYSIS OF DOWNTIME 533

Comparison of the Two Methods. The bulk-arrival method is the more realistic representation of actual conditions. Furthermore, it is compatible with the assumptions made in other sections of the model because it considers the sortie length and the distribution of men per task. The major disadvantage is that the number of working hours between aircraft arrivals is required as input. The queuing-theory approach depicts a highly optimistic system wherein maintenance actions arrive continuously at a constant rate. These assumptions contradict the maintenance downtime routine, which assumes the simultaneous repair of tasks when the aircraft undergoes an inspection. Furthermore, the service time is considered exponentially distributed with a constant manloading, while the maintenance downtime routine assumes log normal service times with variable manloading. The queuing-theory approach can be refined by assuming other distributions of service times, but the computational procedures then become very complex.

RELATION BETWEEN AVAILABILITY, UTILIZATION, AND DOWNTIME

The relationship between availability and utilization has traditionally been expressed as shown in Figure 2. This approach assumes downtime per flight hour to be fixed.

[Figure 2. Traditional A/U plots: straight lines of average utilization per day against availability for fixed downtimes d_1 and d_2, with d_2 > d_1.]

The results of the MAD model show a somewhat different picture. Availability no longer responds linearly to average daily utilization, because some inspections are on a calendar basis, or before and after each flight. When utilization increases, downtime per flight hour increases and forces the curve to bend back because of the additional waiting time accompanying the longer flight program (see Figure 3). The curves designated T_1, T_2, T_3, and T_4 represent increasing manning levels (T_4 > T_3 > T_2 > T_1).

Figure 3.
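The multiple-channel (M/M/s) waiting-time formula above is simple to compute directly. A self-contained sketch, assuming μ is interpreted as the mean service rate (the reciprocal of ATTR) and with illustrative inputs:

```python
import math

def mms_expected_wait(lam, mu, s):
    """Expected time in queue for the standard M/M/s model:
    rho = lam/(s*mu); Wq = P0 * (lam/mu)**s * rho / (s! * (1-rho)**2 * lam)."""
    a = lam / mu                       # offered load in servers
    rho = a / s
    if rho >= 1.0:
        raise ValueError("unstable queue: rho >= 1")
    p0 = 1.0 / (sum(a ** n / math.factorial(n) for n in range(s))
                + a ** s / (math.factorial(s) * (1.0 - rho)))
    l_q = p0 * a ** s * rho / (math.factorial(s) * (1.0 - rho) ** 2)
    return l_q / lam                   # Little's law: Wq = Lq / lam

# e.g., 4 maintenance teams, 3 actions arriving per hour, each team
# completing 1 action per hour on average
print(round(mms_expected_wait(lam=3.0, mu=1.0, s=4), 3))  # → 0.509
```

With s = 1 the formula collapses to the familiar single-channel result Wq = λ/(μ(μ − λ)).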
MAD A/U plots

SUMMARY

The MAD model incorporates a new approach for determining the interrelationships between aircraft malfunction rates and repair times, inspection philosophy, and the number of men assigned to the maintenance facility, and among availability, utilization, and downtime. The model is therefore suitable for evaluating the cost-effectiveness implications of changes in subsystem failure rates and maintenance times. It can also evaluate different inspection philosophies (such as flight-hour inspections versus calendar-day inspections) and determine the optimum number of men required to achieve a specified level of availability and utilization. Such analyses have traditionally been performed with detailed simulations of the maintenance environment incorporating Monte Carlo techniques. In the MAD model, attempts have been made to represent the significant characteristics of the real-world situation as accurately as possible using probability distributions. The model is thus simple and easy to use, and requires significantly less running time than Monte Carlo simulations. The level of detail in this model makes it suitable for tradeoffs and sensitivity analyses in the concept-formulation and early contract-definition phases. The MAD model has been programmed and is operational on an IBM 360/65. It has been validated by testing it against actual maintenance data from Army CH-47 helicopters at Fort Rucker, Alabama. The model is now being used for subsystem tradeoffs and sensitivity analyses for the HLH (Heavy-Lift Helicopter) and LIT (Light Intratheater Transport) Programs.

REFERENCES

[1] Drenick, R. F., "The Failure Law of Complex Equipment," J. Soc. Indus. and Appl. Math., Dec. 1960, pp. 680-690.
[2] Goldman, A. S., and T. B. Slattery, Maintainability: A Major Element of System Effectiveness (John Wiley and Sons, New York, 1967).
[3] Gumbel, E.
J., Statistics of Extremes (Columbia University Press, New York, 1960).
[4] Hillier, F. S., and G. J. Lieberman, Introduction to Operations Research (Holden-Day, 1967).

A METHODOLOGY FOR ESTIMATING EXPECTED USAGE OF REPAIR PARTS WITH APPLICATION TO PARTS WITH NO USAGE HISTORY

Sheldon E. Haber and Rosedith Sitgreaves*
The George Washington University
School of Engineering and Applied Science
Institute for Management Science and Engineering

ABSTRACT

In this paper a model is presented which focuses on the difficult problem of predicting demands for items with extremely low usage rates. These form the bulk of repair parts in military systems. The basic notion underlying the model is the pooling of usage data for common-design items with movement for the purpose of estimating usage rates for similar items which have shown no movement. A unique feature of the model is that it also makes possible the estimation of usage rates for items newly introduced into a system for which no previous usage history is available.

0. INTRODUCTION

The problem of predicting demands for individual repair parts in military inventory systems has received much attention over the last two decades. The problem is complicated by the sporadic nature of demands for military repair parts. For most repair parts, no demands are registered over long periods of time, and when items are demanded, they are generally demanded only once or twice. This fact has been documented by many usage studies† and is once again documented in this study. To illustrate the nature of the demand problem under consideration, usage data for 61 submarine patrols are shown in Table 1. As may be seen from the first entry in the table, no usage

TABLE 1. Distribution of 25,138 Different Repair Parts by the Number of Patrols in Which They Were Demanded

Number of patrols    Frequency of different parts
0                    21,597
1                     1,776
2                       673
3                       333
4                       199
5                       134
6                        96
7                        81
8                        62
9                        32
10                       28
11-61                   127
Total                25,138

*Columbia University.
†For a review of this literature, see Henry Solomon, "A Summary of Logistics Research Project's Experience With Problems of Demand Prediction" in [1].

535 536 S. E. HABER AND R. SITGREAVES

was recorded for the vast majority of items; i.e., the vast majority of items were not demanded in any one of the 61 patrols. Of those that were demanded, one-half were demanded in exactly one patrol. Thus, for most repair parts with usage, the problem of estimating usage rates is a difficult one. The difficulty is compounded by an order of magnitude for the bulk of the items, whose usage histories show zero units used. In this paper we will be concerned with the estimation of expected usage of repair parts for the purpose of computing shipboard allowance lists. A shipboard allowance list is defined as the range and depth of repair parts to be stocked aboard ship to meet uncertain demands. The range of repair parts refers to the number of different items to be stocked; the depth refers to the number of units stocked of an item. Given that repair part usage is sporadic, several demand prediction strategies are available. The most widely practiced approach is to employ usage estimates provided by technicians, i.e., supply personnel responsible for the provisioning of repair parts. In practice, these estimates have been found to be conservative and to lead to relatively expensive stockage lists. Such conservatism, however, is preferred to a much more extreme approach which might assign a zero usage estimate to a repair part until positive usage is experienced. The difficulty with this latter approach is that many repair parts are only one-time movers. Failure to have an adequate quantity of stock aboard ship or in the supply system prior to the first demand can thus lead to a large range of shortages and an unsatisfactory level of readiness. Another approach that has been utilized to estimate usage of slow-moving items is exponential smoothing [2].
In this procedure, a technician's usage estimate is generally employed as the initial estimate. Hence, the initial procurement of repair parts will be based solely on the technician's estimate and will thus be subject to the limitation already noted.* One procedure for the problem at hand is to utilize information not directly pertaining to the repair part being considered.† This is the approach of this paper. The information used pertains to the class of repair parts of which the given repair part is a member. It is assumed that usage data are available for the repair part class and that some of the items in that class have shown movement in the past. The advantage of this procedure is that it permits the pooling of demands where they have occurred and the use of this information for making positive usage estimates for items for which zero usage has been recorded. The procedure also provides an expected usage estimate for new items being introduced into the inventory system for which no usage history is available. The criterion used in this study for defining a repair part class is nomenclature, of which resistor, washer, motor, and valve are examples. It should be noted that within a given class, estimated usage rates will vary depending on the design, environment, location, etc., of each part. The usefulness of partitioning by nomenclature rests on the assumption that variation in usage rates within a given nomenclature class will be less than that among classes. In this case, the stratification of repair items should reduce the variance of the estimates vis-a-vis the alternative of not distinguishing items by nomenclature class. In the next section we present a theoretical model for estimating expected usage of repair parts, in particular repair parts with no usage history. Following a description of the model, goodness-of-fit

*Furthermore, for zero-movers this procedure will quickly lead to usage estimates that are indistinguishable from zero.
†A model of this type, which uses information pertaining to the failure behavior of the part's parent component, is described in [5].

ESTIMATING USAGE OF REPAIR PARTS 537

tests are applied. In the final section, the model is evaluated by developing alternative allowance lists and comparing these lists against actual submarine usage data.

1. THE PROBABILITY MODEL

We consider a class C of items defined, for example, in terms of nomenclature. Let part i be any item classified as belonging to class C, and let y_i = 0, 1, 2, ..., represent the total quantity of units demanded for part i in a specified time period T, say a total of T patrols.* We consider a probability model in which the quantity y for a given part (more precisely, y_i) is a random variable with a Poisson distribution given by

(1) $p(y|\theta) = \frac{(T\theta)^y e^{-T\theta}}{y!}$,

where θ (again more precisely, θ_i) is the parameter of the Poisson distribution of demands for item i in a unit time period, in our case a single patrol.† Note that θ is thus the expected number of units demanded for part i in a patrol. It is assumed further that demands for part i in non-overlapping periods of time are independently distributed. Our problem is to estimate θ for any item i classified as belonging to class C. We distinguish two cases. In the first case, we are concerned with estimating θ for installed items for which usage data, i.e., y values, are available. As indicated earlier, the problem here is complicated by the fact that for the majority of items no usage is recorded, i.e., the observed y values during T time periods are zero. In the second case, we would like to estimate θ, that is, the expected usage, for items classified as belonging to class C but being installed for the first time. In this case, no y values, zero or otherwise, are available.
In both cases, it seems intuitively reasonable to assume that positive usage data for some members of the class should be useful in determining estimates of the θ values for the remaining members. We formalize this by postulating that θ is itself a random variable with a probability distribution over all items in the class C. We then use standard theory to obtain the desired estimates for both cases mentioned above. In general, if p(θ) denotes the probability distribution of θ in the class C, and p(y, θ) the joint distribution of y and θ, then

(2) $p(y, \theta) = p(y|\theta)\,p(\theta) = p(\theta|y)\,p(y)$,

where p(y) denotes the unconditional distribution of y values for the class C, and p(θ|y) the conditional distribution of θ, given y.** In the first case mentioned above, in which the observed y = 0, 1, 2, ..., we estimate θ by θ̄ = E(θ|y) from the conditional distribution of θ. In the second case, when y values do not exist, we estimate θ

*For simplicity, we do not distinguish between the random variable y and the values which it assumes. Throughout, we also use the symbol p( ) to represent different probability distributions.
†The subscript i has been omitted from y and θ here and in the subsequent discussion to facilitate exposition.
**In Bayesian terms, p(θ) is the prior distribution, and p(θ|y) the posterior distribution.

by the value E(θ), the unconditional expected value of θ. In this latter case, the estimate of θ is the same for all new items in the class C, while in the former case θ̄ varies for each item in class C depending on its y value. In considering possible distributions for θ, we assume that the class C can be extended in such a way that θ can be treated as a continuous variable with a probability density function. The preponderance of y values of zero in most classes led us to consider densities whose maximum value occurs at θ = 0.
We examined first the exponential density

$p(\theta) = \frac{1}{\beta} e^{-\theta/\beta}$ for $0 \le \theta < \infty$,

but the resulting calculations did not give evidence of a good fit. A natural extension, which because of its mathematical properties seemed particularly appropriate for the distribution of a Poisson parameter, is the two-parameter gamma distribution. Accordingly, we assumed that

(3) $p(\theta|\alpha, \beta) = \frac{(\alpha/\beta)^{\alpha}}{\Gamma(\alpha)}\,\theta^{\alpha-1} e^{-\alpha\theta/\beta}$ for $0 \le \theta < \infty$,

with α, β > 0. For any value of α < 1, this function is infinite at θ = 0 and is monotonically decreasing as θ increases from 0 to ∞. From (1) and (3), Eq. (2) can be written specifically as

(4) $p(y, \theta|\alpha, \beta) = p(y|\theta)\,p(\theta|\alpha, \beta) = \frac{(\alpha/\beta)^{\alpha}\,T^y}{\Gamma(\alpha)\,y!}\,\theta^{\alpha+y-1} e^{-\theta(\alpha+T\beta)/\beta}$
$= \left[\frac{\theta^{\alpha+y-1} e^{-\theta(\alpha+T\beta)/\beta}}{\Gamma(\alpha+y)} \left(\frac{\alpha+T\beta}{\beta}\right)^{\alpha+y}\right] \left[\frac{\Gamma(\alpha+y)}{\Gamma(\alpha)\,y!}\,\frac{\alpha^{\alpha}\,(T\beta)^y}{(\alpha+T\beta)^{\alpha+y}}\right] = p(\theta|y, \alpha, \beta)\,p(y|\alpha, \beta)$.

Thus, the conditional distribution of θ, given y, also has the form of a gamma distribution, while the unconditional distribution of y for the class C is a negative binomial. From (4) and (3), we find

(5) $\bar\theta = E(\theta|y) = \frac{(\alpha + y)\beta}{\alpha + T\beta}$

for the first case, where y values are available, while

(6) $E(\theta) = \beta$

for the second case, where items are being installed for the first time and there are no y values. For any item in class C with an observed y value, we now have from (5) an estimator of the expected usage for the item, namely θ̄. The estimator can be evaluated for every y value, including zero, if we have values for the two parameters α and β. We estimate these parameters from the observed set of y values for the class C, treating these values as a set of independent observations from a negative binomial distribution with mean value Tβ and variance Tβ(1 + Tβ/α). Let y_1, y_2, ..., y_n be the observed y values for the n items i = 1, 2, ..., n in class C. From the data we estimate the mean and variance by

$\bar y = \frac{1}{n}\sum_{i=1}^{n} y_i$ and $v = \frac{1}{n}\sum_{i=1}^{n} (y_i - \bar y)^2$,

respectively. We estimate Tβ by ȳ, so that

$\hat\beta = \frac{\bar y}{T}$.

In estimating α, we use the method of moments, since this is relatively simple and straightforward.
Since the variance of y is Tβ(1 + Tβ/α), we estimate the variance as

$v = \bar y \left(1 + \frac{T\hat\beta}{\hat\alpha}\right)$,

and with β̂ = ȳ/T from above, obtain

$\hat\alpha = \frac{\bar y^2}{v - \bar y}$.

Hence, in the case where an observed y value is available for a given item, the desired estimate of θ for the item is

$\bar\theta = \frac{(\hat\alpha + y)\hat\beta}{\hat\alpha + T\hat\beta}$,

and in particular when y = 0, we have

$\bar\theta = \frac{\hat\alpha\hat\beta}{\hat\alpha + T\hat\beta} > 0$.

For y > 0, as T becomes large, the quantity $T\hat\beta/(\hat\alpha + T\hat\beta)$ approaches the value 1, and θ̄ approaches y/T. In the case of new items being introduced into the inventory system, the estimated expected usage for the item is given by β̂ = ȳ/T, where it is seen from (6) that β is the expected value of θ for the repair part class describing the new item.

2. EVALUATION OF GOODNESS-OF-FIT

In the preceding section, we assumed that the distribution of expected usage for items in a given class C is a two-parameter gamma distribution. This, in turn, led to a negative binomial distribution of demands for items in the class. In this section, we examine the goodness-of-fit of the model just described. It will be recalled that a similar assessment of the earlier exponential model led to its rejection. The purpose here is not an exact test of a particular hypothesis, but rather to determine the reasonableness of the model finally adopted. An additional test of the model in an inventory context is provided in the next section. In examining the goodness-of-fit of the model, a large number of repair part classes were defined on the basis of nomenclature, and for each class α̂ and β̂ were computed from the available data. Having obtained these estimates, theoretical negative binomial distributions of demands for items in each class were calculated and compared with the actual distributions of y values. The comparison of the actual and theoretical frequencies for each class was made by computing the value of chi-square as an index of goodness-of-fit.
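The estimation procedure of Section 1 can be sketched in a few lines. This is a hypothetical illustration with invented data, not the authors' program; the function names are assumptions:

```python
def fit_class(y_values, T):
    """Method-of-moments fit for one repair part class:
    beta = ybar/T and alpha = ybar**2 / (v - ybar), from the
    negative binomial mean T*beta and variance T*beta*(1 + T*beta/alpha)."""
    n = len(y_values)
    ybar = sum(y_values) / n
    v = sum((y - ybar) ** 2 for y in y_values) / n
    if v <= ybar:
        raise ValueError("sample variance must exceed the mean for this model")
    return ybar ** 2 / (v - ybar), ybar / T    # (alpha_hat, beta_hat)

def expected_usage(y, alpha, beta, T):
    """Posterior-mean estimate, Eq. (5): theta = (alpha+y)*beta/(alpha+T*beta).
    For a brand-new item with no history, Eq. (6) gives simply beta."""
    return (alpha + y) * beta / (alpha + T * beta)

# hypothetical class: 90 zero-movers plus a few moving items over T = 61 patrols
ys = [0] * 90 + [1] * 6 + [2, 3, 5, 9]
alpha, beta = fit_class(ys, T=61)
```

Note that even a zero-mover receives the strictly positive estimate α̂β̂/(α̂ + Tβ̂), which is the point of pooling usage within the class.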
Again, it was not the purpose to use each of the chi-squares as a rigorous test of the corresponding null hypothesis. The intent was to utilize the chi-squares and the associated significance probabilities as the basis for assessing the appropriateness of the model. In evaluating the results, the following points should be kept in mind. First, because of the vagaries of reporting, no model may provide a satisfactory fit to the data. For example, extremely large y values may be expected as a result of mispunched data or stockpiling of material. Additionally, demands for repair parts are often for even-numbered quantities. The prevalence of demands for even quantities may be seen from the distribution of y values for 61 patrols shown in Table 2.

TABLE 2. Distribution of 25,138 Different Repair Parts by the Total Quantity of Units Demanded During 61 Patrols

Total demand    No. of different    Total demand    No. of different
quantity^a      repair parts        quantity^a      repair parts
 3              249                 14              26
 4              249                 15              28
 5              121                 16              27
 6              124                 17               9
 7               86                 18              20
 8               97                 19              14
 9               58                 20              38
10               61                 21              17
11               36                 22              13
12               57                 23              10
13               29                 24              18

^a For the total sample of 25,138 different repair parts, the numbers of items with total demand quantities of 0, 1, and 2 units during 61 patrols were 21,597, 1,027, and 495, respectively. The number of different repair parts with a total demand quantity of 25 or more was 632.

Second, the nature of the chi-square statistic itself is such that relatively small differences between observed and expected relative frequencies will lead to large chi-squares if the sample is large. For the purpose of evaluating goodness-of-fit within a demand prediction context, this relation is an important one. In performing the goodness-of-fit computations, we were able to determine the significance of chi-square for 54 classes of repair parts containing 18,847 different parts. For the repair classes examined, no correction was made for the phenomenon of even-quantity demands.
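The theoretical frequencies compared against the data come from the negative binomial marginal p(y|α, β) of Eq. (4). A minimal sketch of that pmf, computed in log space for numerical stability (the parameter values in the comment are illustrative, not taken from the study):

```python
import math

def neg_binomial_pmf(y, alpha, beta, T):
    """Marginal p(y) from Eq. (4):
    Gamma(alpha+y)/(Gamma(alpha)*y!) * alpha**alpha * (T*beta)**y
        / (alpha + T*beta)**(alpha + y)."""
    log_p = (math.lgamma(alpha + y) - math.lgamma(alpha) - math.lgamma(y + 1)
             + alpha * math.log(alpha) + y * math.log(T * beta)
             - (alpha + y) * math.log(alpha + T * beta))
    return math.exp(log_p)

# e.g., with alpha = 0.1, beta = 0.004, T = 61, the expected frequency of a
# demand total y in a class of n parts would be n * neg_binomial_pmf(y, ...)
```

The pmf sums to 1 and has mean Tβ, which is what the chi-square comparison of observed and expected class frequencies relies on.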
It was possible, how- ever, to correct for the presence of outliers. Items in a repair part class were treated as outliers and eliminated if the smallest y value omitted was large relative to the largest y value included. In almost all cases the outliers had a very low unit price, or a high total installed population, or large individual demand quantities, or a combination of these characteristics. For example, in the repair part class "filters" containing 370 different filters, one filter had a total demand quantity of 320 units — all units of the item being demanded in a single transaction. Of the 18,847 parts in the sample, this item and 37 other repair parts were eliminated as outliers.* The results of the goodness-of-fit computations, after elimination of the 38 items considered to be outliers, are shown in Table 3. TABLE 3. Summary of Chi-Square Computations Different repair parts in class Number of repair part classes Number of classes with poor fit at 0.05 level 0.01 level 100 or Less 101 to 499 500 and Over 10 30 14 1 7 6 1 3 Total 54 14 4 Over all classes, poor fits were obtained for but 4 and 14 of the 54 repair part classes at the 0.01 and 0.05 levels, respectively. As may be seen from Table 3, the incidence of poor fits increased as the number of repair parts in a class increased. In interpreting the results of Table 3, the earlier observa- tion that where the number of items in a class is large, discrepancies between observed and expected relative frequencies may still be small, should be recalled. Indeed, this was the case for almost all of the repair part classes where the chi-square was larger than expected on the basis of chance alone. 3. FURTHER ASSESSMENT OF THE MODEL In addition to examining the goodness-of-fit of the model, shipboard allowance lists were com- puted using as input the demand prediction model previously described. 
These lists were then compared with an allowance list utilizing technicians' usage estimates, both in terms of dollar investment in stock and shortage counts. The purpose of this evaluation was (1) to simulate the performance of the model in the environment for which it was designed, and (2) to determine whether differentiating repair parts by nomenclature class represented an improvement over the simpler approach of grouping all items into a single class. The data base for an initial test consisted of 61 patrols of usage history. The items included in this initial test fall into the first category of repair parts distinguished in this paper, i.e., items for which usage data are available, including data for items with "usage" of zero units.

*The total of 38 outliers was concentrated in 16 repair part classes. No class with 100 or fewer different repair parts contained any outliers; 10 of the 30 classes with 101 to 499 parts contained outliers; the remaining 6 classes with outliers came from the 14 classes with 500 or more parts. This distribution is not inconsistent with the plausible hypothesis that the probability of observing an outlier in a given class increases with the number of different parts in the class.

542 S. E. HABER AND R. SITGREAVES

Employing past usage data for the 61 patrols and the demand prediction model, usage rates were computed for each repair part under two procedures: (1) different α and β were computed for each nomenclature repair part class (Model II A), and (2) a single value of α and β was used for all repair parts regardless of nomenclature class (Model II B).* Allowance list quantities were then computed for these procedures and for the one incorporating technicians' usage estimates (Model I), using the inventory model described in [4]. In all cases the inventory model was used with the same parameters. Thus, the only difference in the computation of the allowance lists was the technique used for deriving usage estimates.
The allowance list quantities were next compared against usage data during a subsequent 21 patrols. The data for these patrols were not used in the initial calculation of usage rates. After each new patrol the model allowance list quantities were updated. No updating procedure was available for the quantities computed using the technicians' estimates. Summary data describing the allowance lists computed for items with previous usage history are shown in Table 4. As may be seen, Model I was about three times as expensive as the other two models. In terms of depth, or number of units stocked, Model I stocked almost five times as many units as the other models. In terms of range, or number of different items stocked, both Models I and II A stocked more items than Model II B. Thus, one effect of distinguishing among repair part classes on the basis of nomenclature was to increase the range of repair parts stocked by the model.

Table 4. Range, Depth, and Dollar Value of Investment: Items With Previous Usage History^a

Model | Range of items stocked^b | Depth of units stocked^c | Dollar value
I    | 18.6 | 112.3 | 2,703.1
II A | 18.9 |  25.4 |   960.4
II B | 16.0 |  22.5 |   854.7

^a Averages for 21 patrols. All figures in thousands.
^b Number of different repair parts stocked.
^c Number of units stocked.

The average range, or number of different items with a shortage, and the average depth, or number of units short per patrol, are shown in the first and second columns, respectively, of Table 5. It should be remarked that the latter measure is not without difficulty of interpretation because of the mix of different units of measure among items; e.g., some items are measured in feet while others are measured in units of "each." For the sake of completeness, however, this measure is included as an alternative measure of performance. In Table 5, shortage counts are provided separately for items not stocked and for items stocked.
These two categories of stock are distinguished since items in the former category tend to be "not carried" over successive patrols. From Table 5, it is seen that for items stocked, there were on the average 19.5 and 23.1 different items with a shortage per patrol for Models I and II A, respectively. Over all items with a shortage, the total number of units short averaged 172.3 and 225.0. In terms of the number of units short per item short, Model I averaged 8.8 (172.3 ÷ 19.5) as compared to 9.7 (225.0 ÷ 23.1) for Model II A.

*For Model II B, α = 0.00787 and β = 0.02414.

Table 5. Shortage Counts: Items With Previous Usage History^a

Shortages: All items
Items | Range^b | Depth^c
Not stocked:
  Model I    | 2.5 (2.2)   | 6.8 (13.3)
  Model II A | 3.0 (2.8)   | 4.8 (5.1)
  Model II B | 6.7 (4.0)   | 12.1 (8.8)
Stocked:
  Model I    | 19.5 (12.8) | 172.3 (132.2)
  Model II A | 23.1 (14.3) | 225.0 (196.3)
  Model II B | 21.2 (13.8) | 219.6 (195.0)

^a Averages for 21 patrols. Standard deviation in parentheses.
^b Number of different repair parts with shortages.
^c Number of units short over all repair parts.

A second test, similar to the one described above, was performed for 4,094 items which were treated as new items being introduced into the system. It should be noted that none of these items were included in the previous test. Following the model, in developing usage rates for this test only the parameter β was used. For Model II A, β varied from class to class; for Model II B, β was invariant for all items. In each case, the β value used was the same β value employed in the first test. Thus this second test was a more stringent one in that not only were inventory quantities matched against unknown future usage (for 35 patrols), but in estimating item demand distributions the input data were from a completely different set of repair parts.
Summary figures describing the allowance lists and shortage counts for items which were treated as new items being introduced into the system are shown in Tables 6 and 7, respectively. The format of these tables is the same as that of Tables 4 and 5.

Table 6. Range, Depth, and Dollar Value of Investment: Items With No Previous Usage History^a

Model | Range of items stocked^b | Depth of units stocked^c | Dollar value
I    | 3.9 | 18.9 | 450.2
II A | 3.7 |  4.4 | 231.8
II B | 3.2 |  3.8 | 145.2

^a Averages for 35 patrols. All figures in thousands.
^b Number of different repair parts stocked.
^c Number of units stocked.

From Table 6, one notes that, as in the case of items with previous usage history, Model I was the most expensive one. The additional dollar value of investment for Model I was once again accounted for by the large number of units stocked, given that an item was stocked. Likewise, the range of different items stocked was least for Model II B. An examination of Tables 5 and 7 indicates that for items not stocked, Models I and II A performed about the same; Model II B performed less well than the other models because of its reduced range of items stocked. For stocked items, however, Model I performed better than the other models; the performance of Models II A and II B was very similar. Thus, on the basis of the shortage measures alone, Model I was ranked higher than Model II A because it had fewer shortages for stocked items. Model II A was ranked higher than Model II B because it had fewer shortages for nonstocked items.

Table 7. Shortage Counts: Items With No Previous Usage History^a

Shortages: All items
Items | Range^b | Depth^c
Not stocked:
  Model I    | 1.0 (1.2) | 1.7 (2.1)
  Model II A | 0.9 (1.4) | 1.5 (2.0)
  Model II B | 6.0 (6.4) | 10.1 (10.0)
Stocked:
  Model I    | 3.6 (3.5) | 33.8 (53.6)
  Model II A | 6.1 (4.3) | 48.4 (58.8)
  Model II B | 5.3 (4.0) | 44.6 (57.7)

^a Averages for 35 patrols. Standard deviation in parentheses.
^b Number of different repair parts with shortages.
^c Number of units short over all repair parts.

One should note that the difference in performance between Models I and II A was small. Model I had 3 to 4 fewer items with a shortage per patrol; given a shortage, the number of units short per item short was at most one less for Model I. On the other hand, the difference in investment cost between the two models was substantial: Model I was approximately 2 to 3 times as expensive as Model II A. The difference in performance between Models II A and II B was about the same as that between Models I and II A. In terms of investment cost, however, Model II B was somewhat less expensive than Model II A. The finding of small differences in performance between models is reinforced by an examination of shortage counts for those repair parts which were highly essential.* Shortage counts for this class of items are found in Table 8. As may be seen, for these items, with the exception of the depth shortage measure for stocked items with no previous usage history, all models performed about the same.

Table 8. Shortages for Highly Essential Items^a

              | With previous usage history       | With no previous usage history
Items         | Range (1)   | Depth (2)           | Range (3)   | Depth (4)
Not stocked:
  Model I     | 0 (0)       | 0 (0)               | 0.1 (0.2)   | 0.2 (0.7)
  Model II A  | 0 (0)       | 0 (0)               | 0 (0)       | 0 (0)
  Model II B  | 0 (0)       | 0 (0)               | 0 (0)       | 0 (0)
Stocked:
  Model I     | 2.2 (3.4)   | 19.1 (27.4)         | 0.1 (0.2)   | 1.7 (7.8)
  Model II A  | 2.4 (3.7)   | 23.3 (33.9)         | 0.4 (0.6)   | 5.1 (11.2)
  Model II B  | 2.3 (3.6)   | 23.2 (33.9)         | 0.6 (1.2)   | 5.5 (11.8)

^a Averages for 21 patrols in Cols. 1 and 2; averages for 35 patrols in Cols. 3 and 4; standard deviation in parentheses.

*A discussion of military essentiality coding of repair items is found in [3].

Based on the findings of this section, we conclude that relative to the substantial difference in investment cost between Model I and Model II A, the difference in performance between these two models was small.
Considering cost as well as performance, Model II A was judged superior to Model I. Because of the fewer shortage counts for Model II A vis-a-vis Model II B, and the similarity of costs between them, Model II A, in which items were distinguished by nomenclature class, was judged superior to Model II B, where all items were lumped into a single class.

4. SUMMARY

In this paper a model is presented which focuses directly on the difficult problem of predicting demands for items with extremely low usage rates, which form the bulk of repair parts in military systems. In the model, repair part demands are assumed to be Poisson distributed while their means are assumed to be gamma distributed. A basic notion underlying the model is the pooling of usage data for items that have shown some movement for the purpose of estimating usage rates for those items which have shown no movement. At the outset, repair parts were partitioned into different classes. An assessment of goodness-of-fit was performed for 54 different classes of items to determine whether the unconditional distribution of demands was indeed negative binomial, as postulated by the model. Given the vagaries of the data, e.g., disproportionately large numbers of even demands and large outliers due probably to mispunched data and stockpiling of material, the model fit the data quite well. Although the partitioning of repair parts was not essential to the model, it was assumed that such partitioning would yield improved estimates of usage rates. The goodness-of-fit computations and other tests conducted in an inventory context suggest that this indeed was the case. A unique feature of the model is that in addition to providing positive usage estimates for repair parts with previous usage history, regardless of whether or not a particular part was observed to move, the model also makes possible the estimation of usage rates for new items for which no previous usage history is available.
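The distributional claim in this summary, that Poisson demands with gamma-distributed means yield a negative binomial unconditional distribution, follows from a standard mixture integral; sketched here with gamma shape α and scale β (a parametrization we assume, since the paper does not restate it at this point):

```latex
P(Y=y)=\int_0^\infty \frac{e^{-\lambda}\lambda^{y}}{y!}\cdot
\frac{\lambda^{\alpha-1}e^{-\lambda/\beta}}{\Gamma(\alpha)\,\beta^{\alpha}}\,d\lambda
=\frac{\Gamma(\alpha+y)}{\Gamma(\alpha)\,y!}
\left(\frac{1}{1+\beta}\right)^{\alpha}
\left(\frac{\beta}{1+\beta}\right)^{y},\qquad y=0,1,2,\dots
```

This is the negative binomial distribution with mean αβ, the form whose fit is assessed in the chi-square computations above.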
In an inventory context under stringent test conditions, the model performed equally well in both contexts when compared with the current procedure for estimating usage rates. Mean shortage counts for the model were slightly higher than those for the current procedure over all items, and about equal for highly essential items. On the other hand, differences in cost were marked, with the current procedure costing two to three times as much as the proposed model. As indicated by this study, the notion of pooling usage data, and from such data extrapolating usage rates for installed items with zero usage or for items being newly introduced into a system, is a useful one.

ACKNOWLEDGMENT

The authors wish to thank William Hise and Talbot Walls, Jr., for their programming assistance.

REFERENCES

[1] Astrachan, M. and A. S. Cahn (Editors), Proceedings of Rand's Demand Prediction Conference, January 25-26, 1962, The Rand Corporation, RM-3358-PR (1963).
[2] Brown, R. G., Statistical Forecasting for Inventory Control (McGraw-Hill, New York, 1959).
[3] Denicoff, M., J. Fennell, S. E. Haber, W. H. Marlow, F. W. Segel, and H. Solomon, "The Polaris Military Essentiality System," Nav. Res. Log. Quart. 11, 235-257 (1964).
[4] Denicoff, M., J. Fennell, S. E. Haber, W. H. Marlow, and H. Solomon, "A Polaris Logistics Model," Nav. Res. Log. Quart. 11, 259-272 (1964).
[5] Haber, S. E., R. Sitgreaves, and H. Solomon, "A Demand Prediction Technique for Items in Military Inventory Systems," Nav. Res. Log. Quart. 16, 302-308 (1969).

A STOCHASTIC CONSTRAINED OPTIMAL REPLACEMENT MODEL*

Peter J. Kalman
Department of Economics
State University of New York
Stony Brook, N.Y.

ABSTRACT

In this paper a stochastically constrained replacement model is formulated.
This model determines a sequence of replacement dates such that the total "current account" cost of all future costs and capital expenditures over an infinite time horizon for the n initial incumbent machines is minimized, subject to the constraints that an expected number of machines are in a chosen utility class at any point in time. We then indicate one possible solution method for the model.

I. INTRODUCTION

The paper is structured as follows. We first define the basic notation in section 2. Next, in section 3, an analytical model is developed which determines a sequence of replacement dates for the ith initial incumbent machine (i = 1, 2, . . ., n) such that the total "current account" cost of all future costs and capital expenditures over an infinite time horizon is minimized, subject to the constraint that at any point in time there exists a desired expected number of machines in any chosen "utility" class. We then discuss one possible solution (out of many) under specified assumptions.

II. NOTATION

Suppose there exist n initial incumbent machines. In order to define the model, the following notation is useful:

u_j^i, the utilization rate of the jth machine in the sequence of replacements for the ith initial incumbent machine, i = 1, . . ., n, j = 1, 2, . . . (from here on this machine is abbreviated MRIM_ij);

L_j^i, the age of MRIM_ij when the decision to replace it is made;

l_j^i, the age of MRIM_ij when it is replaced;

t, the age of MRIM_ij, 0 ≤ t ≤ l_j^i;

C_j^i, the utility class of MRIM_ij (C_j^i = 1, 2, . . ., q, in decreasing order of desirability. That is, each machine is represented by an ordinal measure of its utility characterized by one of the above integers. Moreover, every machine belongs to one and only one utility class at any point in time.);

R_j^i, a parameter indicating whether MRIM_ij is new or a modernization:

*This work was started while the author was on the professional staff of the Center for Naval Analysis.
It was completed while the author was a visiting professor at the Naval Postgraduate School.

R_j^i = 1 if MRIM_ij is new, and R_j^i = 0 if MRIM_ij is a modernization;

t_j^i, the time when a decision to purchase a replacement for MRIM_ij is made, 0 ≤ t_j^i ≤ ∞;

x_k^i(t), the expected number of machines in the ith sequence in utility class k at time t;

X_k(t), the expected number of machines in utility class k (k = 1, . . ., q) at time t (0 ≤ t ≤ ∞);

p_kj(τ), the probability that a machine which was in class k when it began operating will have moved to class j by the time it is τ years old;

t̄^i, the time the latest replacement machine in the ith sequence began operating;

t̃^i, the starting time of the latest modernization in the ith sequence.

In the model to be presented, the term "replacement" includes "modernization" as well as new machine acquisition. The same applies to "purchase of a replacement." Hence, if MRIM_ij is "modernized," then the modernized version is called "new" and a "new machine" when it enters the process. It follows that "age" represents "duration of time in the process." The symbols L_j^i, C_j^i and R_j^i represent decision variables, and C_j^i refers to the utility class when new.

III. THE MODEL

The model defined below allows a three-way choice among not replacing, modernizing, or building a new machine at any point in time. Furthermore, the model will determine a sequence of replacement purchase times (i.e., {t_j^i}, j = 1, 2, . . ., for each i, i = 1, . . ., n) such that the total "current account" cost of all future costs over an infinite time horizon is minimized, subject to the constraints that at each point in time there exist a prescribed expected number of machines in each chosen utility class k (k = 1, 2, . . ., q). It is assumed that there exist n initial incumbent machines.
The operating expense (cost) function for MRIM_ij is

(3.1)  φ_j^i(u_j, t_{j-1}, t, R_j, C_j),

where it is understood that the superscript outside a function applies to all the appropriate arguments of that function. Clearly, u_j, t, R_j and C_j influence the operating expense of MRIM_ij. The time at which the decision to replace is made (t_{j-1}) determines the technological state of advancement of the machine to be installed and hence can also influence φ^i.

In addition to operating cost, there is one other major category of cost which must be considered: the investment cost (W_j^i) of MRIM_ij. This will depend upon whether it is a new machine or a modernization, and on the utility class chosen for it. That is,

(3.2)  W_j^i(R_j, C_j).

Since machines normally have some salvage value at the time of replacement, W_j^i is not the net capital cost of a machine. To obtain this, we subtract the salvage value (which depends on the replacement age) of MRIM_ij. Hence, the net capital cost of MRIM_ij is

(3.3)  W_j^i(R_j, C_j) − S_j^i(l_j) e^{−r l_j}.

We note that salvage value may also depend on absolute time, but for simplicity we omit this. There is one other important type of element to be considered before one can formulate the "current account" cost. This is the time required to accomplish a replacement of MRIM_ij of age L_j^i if the replacement action starts at time t_j^i. Before defining this relation, we point out that if a machine is to be replaced by an acquisition, it will remain in service until the new arrival, whereas it may be immediately withdrawn if it is to receive modernization. The time required to accomplish a replacement of MRIM_ij of age L_j^i if the replacement action starts at time t_j^i is represented by γ_j^i(L_j^i, t_j^i). The L_j^i and the t_j^i are related by

t_j^i = t_{j−1}^i + γ_{j−1}^i(L_{j−1}^i, t_{j−1}^i) + L_j^i,  j ≥ 2, i = 1, 2, . . ., n.

For future notational convenience, let γ_j^i represent γ_j^i(L_j^i, t_j^i), j = 0, 1, 2, . . ., i = 1, . . ., n. Define r as the continuous rate of discount.
Also, define the current account cost (w^i) as the outlay stream that has the same present value as all the cost outlays associated with the ith initial incumbent machine in an infinite chain of replacements. That is,

(3.4)  w^i = [∫_0^{l_0} φ_0^i(u_0, t, R_0, C_0) e^{−rt} dt − S_0^i(l_0) e^{−r l_0}]
            + e^{−r(t_0^i + γ_0^i)} [∫_0^{l'} φ_1^i(u_1, t_0, τ, R_1, C_1) e^{−rτ} dτ + W_1^i(R_1, C_1) − S_1^i(l_1) e^{−r l_1}]
            + · · ·
            + e^{−r(t_m^i + γ_m^i)} [∫_0^{l'} φ_{m+1}^i(u_{m+1}, t_m, τ, R_{m+1}, C_{m+1}) e^{−rτ} dτ + W_{m+1}^i(R_{m+1}, C_{m+1}) − S_{m+1}^i(l_{m+1}) e^{−r l_{m+1}}]
            + · · ·,

where i = 1, . . ., n; l' = L_{m+1}^i if the (m+1)st machine's replacement in the ith sequence is a modernization, and l' = L_{m+1}^i + γ_{m+1}^i if a new machine is built; m = −1, 0, 1, 2, . . .; and φ_0^i(u_0, t, R_0, C_0) is the operating expense function for the ith incumbent machine. Note that the investment cost of the ith incumbent machine, W_0^i, does not appear in (3.4) since, by hypothesis, the incumbent machine is a sunk investment whose capital cost should not be allowed to influence decisions. The integrals in (3.4) discount the cost stream of each replacement back to the point in time when the machine was new. The exponential term outside each set of brackets then discounts all these cost streams back to the present time. Similarly, the second and third terms inside each bracket, when discounted, determine the present value of all future investment outlays net of salvage values. Finally, I would like to note that equation (3.4) can be reduced to its discrete analogue in the dynamic programming framework under appropriate assumptions.²

The constraint set will now be formulated. It will be formulated first as an algebraic system and then as a differential equation system. The choice of which formulation to use will depend upon the application of the model. From the definitions of section 2 it follows that

(3.5)  X_j(t) = Σ_{i=1}^{n} Σ_{k=1}^{q} ξ_k^i([t^i]) p_kj(t − [t^i]),  j = 1, 2, . . ., q,

with the constraint

(3.6)  X_j(t) = x̄_j(t),  j = 1, . . ., q,

where x̄_j is chosen by the decision maker (we specify a way for calculating the x̄'s below), [t^i] = max {t̄^i, t̃^i}, and

ξ_j^i(t) = 1 if the machine in use at time t in the ith incumbent machine's sequence is in utility class j, and 0 if it is not.

Note that ξ_j^i(t) is a function of current time t. If a machine in the ith sequence is in the process of being modernized at time t, then ξ_j^i(t) = 0 for all j. Consider the inner summation first. ξ_k^i([t^i]) indicates the utility class which we selected for the machine currently in use at time t in the ith sequence when it was new, or tells us that a modernization is currently being undertaken and, consequently, that no machine in the ith sequence is currently in use (i.e., ξ_k^i([t^i]) = 0 for all k). For a sequence i in which a modernization is not being undertaken at time t, ξ_k^i([t^i]) = ξ_k^i(t̄^i) = 1 for one and only one k. If [t^i] = t̄^i, then t − [t^i] gives the age of the machine in use at time t in the ith sequence. If [t^i] = t̃^i, the p_kj(t − [t^i]) term is irrelevant since ξ_k^i([t^i]) = 0 for all k. Thus, the inner sum tells us, for each sequence i in which a machine is in use at time t, the probability of that machine being in utility class j at time t, given the class of the machine when it was built. Then, with the outer summation, we simply sum over all machines in use at time t.

In the above formulation, the constraints are functions of the replacement purchase times ([t^i] is either a replacement purchase time or a time at which a replacement machine begins operation, which is related to a replacement purchase time by the function γ^i) and of the utility classes which we choose for the replacement machines, as reflected in the values of ξ_k^i( ). The constraints may have a countable number of discontinuities in an infinite horizon model, since it is likely that there will be a "bump" every time we are able to choose a utility class for a machine (i.e., when we have a modernization or build a new machine).
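The present-value structure of the cost expression (3.4), each machine's discounted operating cost plus investment net of salvage, with each bracket discounted again back to the present, can be illustrated numerically. The following is a sketch under simplifying assumptions introduced only for illustration (identical replacements, a constant operating-cost rate, instantaneous replacement, and truncation of the infinite chain); it is not part of the author's model:

```python
import math

def current_account_cost(phi, W, S, l, r, n_cycles):
    """Present value of a chain of identical replacements:
    phi      constant operating-cost rate per unit time,
    W        investment cost of each replacement,
    S        salvage value received at replacement age l,
    r        continuous discount rate,
    n_cycles number of replacements used to truncate the infinite chain."""
    # integral of phi * e^{-r t} over one machine life [0, l]
    per_cycle_op = phi * (1.0 - math.exp(-r * l)) / r
    total = 0.0
    for m in range(n_cycles + 1):       # m = 0 is the incumbent machine
        cycle_cost = per_cycle_op - S * math.exp(-r * l)
        if m > 0:                       # the incumbent's investment cost is sunk
            cycle_cost += W
        # with instantaneous replacement, machine m starts at time m * l
        total += math.exp(-r * m * l) * cycle_cost
    return total
```

With phi = S = 0 the value reduces to the geometric sum W Σ_{m≥1} e^{−r m l}, which gives a quick check on the truncation.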
An alternative formulation of the constraint set will be given via a system of differential equations. This formulation involves the solution of a differential equation system while the above formulation does not. For each sequence i (i = 1, . . ., n) we have the following matrix differential equation:

(3.7)  dX^i/dt = P(t − [t^i]) X^i,

²See [1].

where P is a q × q matrix of transition probabilities p_kj(t − [t^i]) and X^i is a q × 1 matrix of the x_j^i(t), subject to the initial condition

(3.8)  X^i([t^i]) = X_0^i,

where X_0^i is a q × 1 matrix whose jth element is x_j^i([t^i]), with

x_j^i([t^i]) = 1 if [t^i] = t̄^i and we selected utility class j for this machine when it was built, and 0 otherwise.

That is, if modernization is going on at time t in the ith sequence, we do not have to be concerned about any transition probabilities between classes because no machine is in use. Otherwise, we look back to see in what class we put the machine in use at time t in the ith sequence when it was new. In system (3.7) the jth equation,

dx_j^i/dt = Σ_{k=1, k≠j}^{q} p_kj(t − [t^i]) x_k^i(t) + (p_jj(t − [t^i]) − 1) x_j^i(t),  j = 1, . . ., q,

states that the net rate of change of the number of machines in the ith sequence in class j in an infinitesimal time period equals the total number of machines which enter class j in the ith sequence at time t, minus the number of machines in the ith sequence which leave class j for other classes. If we assume that P is an analytic coefficient matrix, then our finite system of first-order linear differential equations with analytic coefficient functions (3.7), subject to the initial value (3.8), admits a unique analytic-function solution, which can be found explicitly by equating coefficients in the relevant power series.³ Clearly, there are an infinite number of functions in this class which could be used to represent the transition probabilities as functions of time. Linear or exponential functions are just two elements of this class which we can use.
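System (3.7) is a linear system of ordinary differential equations, and when the power-series or eigenvalue forms are inconvenient it can be integrated numerically. The following is a generic sketch (a textbook fourth-order Runge-Kutta integrator, not anything specified in the paper; A may be any time-varying coefficient matrix):

```python
def mat_vec(A, x):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def solve_linear_ode(A, x0, t_end, steps=10000):
    """Integrate dx/dt = A(t) x from t = 0 to t = t_end by classical RK4.
    A is a function of t returning a square matrix (list of lists)."""
    h = t_end / steps
    x = list(x0)
    t = 0.0
    for _ in range(steps):
        k1 = mat_vec(A(t), x)
        k2 = mat_vec(A(t + h / 2), [xi + h / 2 * ki for xi, ki in zip(x, k1)])
        k3 = mat_vec(A(t + h / 2), [xi + h / 2 * ki for xi, ki in zip(x, k2)])
        k4 = mat_vec(A(t + h), [xi + h * ki for xi, ki in zip(x, k3)])
        x = [xi + h / 6 * (a + 2 * b + 2 * c + d)
             for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]
        t += h
    return x
```

For a constant coefficient matrix this reproduces X(t) = e^{At} X(0), which is the case treated analytically at the end of this section.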
One should note that we have a set of q differential equations for each sequence i. Let us denote:

TNDM_i, the time of the next decision to modernize a machine in the ith sequence;
TNRO_i, the time the next replacement machine begins operation in the ith sequence.

For fixed i, we may have a different set of initial conditions for each interval of the form [t^i] ≤ t < min {TNDM_i, TNRO_i}, reflecting the series of decisions we can make concerning the class of the replacement machines. Thus, for each i, we would have at most q non-trivial sets of differential equations to solve, corresponding to the q different non-trivial initial conditions we may have. Of course, a solution corresponding to a given set of initial conditions may apply to intervals of the form [t^i] ≤ t < min {TNDM_i, TNRO_i} of different lengths. Thus, the x_j^i(t) will be functions of the replacement purchase times and the utility class decisions.

³See [2], pp. 89-92.

Then X_j(t) = Σ_{i=1}^{n} x_j^i(t). However, we could only hope for continuity of X_j(t) for

max_i [t^i] ≤ t < min_i {TNDM_i, TNRO_i}.

That is to say, X_j(t) would be continuous only in intervals t ∈ ∩_i ([t^i], min {TNDM_i, TNRO_i}). The same is true of the continuity of X_j(t) in our non-differential equation approach. For example, suppose we have two machines.

[Timeline diagram: the decision and operation times for the sequences of Machines 1 and 2, with dotted lines marking the interval in which no utility-class decisions occur.]

Specifically, suppose a decision is made to build a new machine in sequence 1, while t̃² involves a decision to modernize the machine in use in sequence 2. Thus [t¹] = t̄¹ and [t²] = t̃², and ∩_i ([t^i], min {TNDM_i, TNRO_i}) is the interval indicated by the dotted lines. It is within this interval that no decisions regarding the utility class of the machines are made. Consequently, no "bumps" are experienced. Instead, machines change classes smoothly according to the transition probabilities.
It should be emphasized that the main factor behind the two different approaches to the constraint formulation is the definition of the transition probabilities. It seems likely that the first approach is preferable from a practical standpoint. If the transition probability matrix is a constant matrix P, it is well known that system (3.7) has a unique solution of the form

(3.12)  x_l^i(t) = Σ_j Σ_{k=1}^{m_j} c_{jk} t^{k−1} e^{μ_j t} V_{lj},

where μ_1, . . ., μ_k are the eigenvalues of P with multiplicities m_1, . . ., m_k, respectively, and V_{lj} is the eigenvector associated with μ_j. The solution (3.12) represents the number of machines in class l (l = 1, . . ., q) in the ith sequence at time t.

We now formulate the constrained optimization problem of determining a sequence of replacement times such that the total current account cost of all future costs and capital expenditures over an infinite time horizon of the n incumbent machines is minimized, subject to the chosen constraints that x̄_k(t) machines are in utility class k at time t (k = 1, . . ., q). That is, we want to

minimize  Σ_{i=1}^{n} w^i,

where w^i is defined by (3.4), subject to the constraints X_j(t) = x̄_j(t), j = 1, . . ., q, which are defined by (3.5). One possible way to proceed is to assume that the constraints are exactly satisfied for all values of t and to make a discrete approximation of the objective function. Then we can apply dynamic programming techniques in order to solve the problem. As an exercise we have solved the model under the following assumptions: (1) a constant level of utility over time; (2) a linear operating cost function; (3) a constant rate of machine utilization; (4) instantaneous replacement; (5) equal economic lives of all replacements after the first; and (6) equal investment costs of all machines.

IV. ACKNOWLEDGMENT

I would like to thank G. E. Bowden, L. Ravenscroft and D. Richter, who are members of the professional staff of the Center for Naval Analysis, for their helpful comments.
REFERENCES

[1] Dreyfus, S., Dynamic Programming and the Calculus of Variations (Academic Press, New York, 1965).
[2] Hochstadt, H., Differential Equations: A Modern Approach (Holt, Rinehart and Winston, New York, 1964).
[3] Hotelling, H., "A General Mathematical Theory of Depreciation," Journal of the American Statistical Association 20, p. 340 (September 1925).
[4] Jorgenson, D. W., J. J. McCall, and R. Radner, Optimal Replacement Policy (Rand McNally, Chicago, 1967).

A NOTE ON THE CALCULATION OF EXPECTED TIME-WEIGHTED BACKORDERS OVER A GIVEN INTERVAL

Chung-Mei Ho
AMC Inventory Research Office

ABSTRACT

Two formulae are presented for calculating expected time-weighted backorders over a fixed time interval. One formula is a more precise form of a result found in the literature and is found using a direct intuitive approach. The second formula is derived using the steady-state distribution of inventory and is directly compatible with the use of steady-state (R, Q) models. The two formulae are compared and reconciled.

1. INTRODUCTION

A rather old problem is considered. An inventory system begins with N units of stock, all on hand. Inventory is depleted by demands which arrive in a random pattern. All demands occurring when the system is out of stock are backordered. What is the expected number of time-weighted backorders over a period of length L, assuming that no stock arrives during the period L? The time-weighting is linear, i.e., a backorder lasting for 2 weeks counts as much as two backorders lasting for 1 week each. Typically, of course, L represents a procurement lead time. Two solution techniques are compared. One approach is to deduce the answer from a result obtained in the solution of steady-state (R, Q) inventory models, such as described in Hadley and Whitin [2]. A more direct intuitive approach is presented in section 2.
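The quantity asked for can be made concrete with a small simulation: draw the period's Poisson demand, place the arrivals uniformly over the period (valid for a Poisson process), and accumulate the waiting times of the demands beyond the initial stock N. This is an illustrative sketch, not code from the note, and the parameter values in the usage comment are arbitrary:

```python
import math
import random

def simulate_time_weighted_backorders(N, u, trials, rng):
    """Average time-weighted backorders over [0, 1], starting with N on hand,
    Poisson demand with mean u, all excess demand backordered until time 1."""
    limit = math.exp(-u)
    total = 0.0
    for _ in range(trials):
        # Knuth's method for a Poisson(u) variate
        k, prod = 0, rng.random()
        while prod > limit:
            k += 1
            prod *= rng.random()
        # given total demand k, arrival times are uniform on [0, 1]
        arrivals = sorted(rng.random() for _ in range(k))
        # demands beyond the Nth wait from arrival until the replenishment at 1
        total += sum(1.0 - a for a in arrivals[N:])
    return total / trials
```

With N = 3 and u = 5, for instance, the estimate settles near 0.65, which agrees with the closed form derived in section 2.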
It is found that: (a) for Poisson distributed demand both approaches are equivalent, though the formulae are very different in appearance; (b) when demand is large and the Normal approximation to the Poisson is used, the results given by the two formulae are not the same; and (c) both give reasonable approximations to the true solution, i.e., the solution using Poisson demand. Both formulae improve as R approaches or falls below the mean lead time demand. The primary significance of this work is in reconciling the two approaches. Moreover, the direct-approach formula is given in a more precise form than we have come across in the literature; see, for example, Hanssmann [3, pp. 47-48].

2. DIRECT APPROACH - POISSON DEMAND

We assume a stationary Poisson demand process. Given that the initial on-hand inventory is N units, what is the expected number of time-weighted backorders over the period L? Let Y be the total demand, u the expected value of demand, and B_L the time-weighted backorders over L. Figure 1 helps in the derivation.

FIGURE 1. [Figure not reproduced: net inventory plotted against time over (0, L); the area below zero is B_L.]

From the diagram, the total time-weighted backorders are represented by the area B_L. There are Y - N orders outstanding just before the replenishment arrives. Numbering the customers backwards, let T_i be the waiting time of the (Y - N - i + 1)th customer backordered; then T_1 is the waiting time of the last customer. Letting B_L|Y be the time-weighted backorders given demand Y,

(2.1)    B_L|Y = Σ_{i=1}^{Y-N} T_i,    so    E(B_L|Y) = Σ_{i=1}^{Y-N} E(T_i|Y).

For a Poisson process, given the total demand over the period, the arrival time of each customer is a random sample from a uniform distribution; for a proof of this statement see, for example, Karlin [4, chap. 9]. To simplify the derivation, and without loss of generality, L is assumed to be 1. The T_i's have a Beta distribution,* and

(2.2)    E(T_i|Y) = i/(Y + 1),    i = 1, 2, ..., Y - N,

so

E(B_L|Y) = (Y - N)(Y - N + 1)/[2(Y + 1)],

thus

(2.3)    E(B_L) = (1/2) Σ_{Y=N}^{∞} [(Y - N)(Y - N + 1)/(Y + 1)] p(Y),    where p(Y) = u^Y e^{-u}/Y!.

3. RESULT IMPLICIT IN SOLUTION OF STEADY-STATE (R, Q) MODEL

Consider an (R, Q) system with fixed procurement lead time L. If demand is stationary and Poisson, then using the steady-state approach† the expected total time-weighted backorders per year is

(3.1)    E(SSB_yr) = (1/Q) Σ_{Y=R+1}^{∞} (Y - R - 1)[P(Y) - P(Y + Q)],

where

P(Y) = Σ_{k=Y}^{∞} p(k).

*For reference, see Fisz [1, p. 377].
†See, for example, Hadley and Whitin [2, p. 184].

Now as Q gets large, the probability that lead time demand will exceed Q approaches 0. If we assume the probability is effectively 0, and L is 1, then

(3.2)    E(SSB_yr) = (1/Q) Σ_{Y=R+1}^{∞} (Y - R - 1) P(Y)

and

(3.3)    E(SSB_cy) = (Q/u) E(SSB_yr) = (1/u) Σ_{Y=R+1}^{∞} (Y - R - 1) P(Y),

where E(SSB_cy) is the expected time-weighted backorders per cycle. A cycle can be defined as the time between initiations of successive procurements. Equation (3.3) follows from the fact that if lead time demand does not exceed Q, there can be at most one procurement order outstanding at a time; this implies that cyclic statistics, such as cyclic backorders, can be determined from yearly statistics as shown.*

An implication of the assumption that there is never more than one order outstanding is that all R assets at the time an order is placed are on hand. Thus, backorders sustained over a cycle are equivalent to backorders sustained during that part of the cycle which begins with the start of the procurement lead time. This establishes an identity between backorders in a cycle and the problem posed in Section 2 (with R substituted for N):

(3.4)    E(B_L) = E(B_cy) = (1/2) Σ_{Y=R}^{∞} [(Y - R)(Y - R + 1)/(Y + 1)] p(Y).

4. RECONCILIATION OF DIRECT APPROACH TO STEADY-STATE (R, Q) MODEL

Both Eqs. (3.3) and (3.4) give expected time-weighted backorders over a procurement lead time. A few steps are needed to show that they are identical.
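The claimed identity can also be checked numerically. The following Python sketch (not part of the original note; function names are mine, and the sums are truncated where the Poisson tail is negligible) evaluates the direct formula (3.4) and the steady-state formula (3.3) for Poisson lead time demand:

```python
import math

def poisson_pmf(y, u):
    # p(Y) = u^Y e^{-u} / Y!, computed via logs for numerical stability
    return math.exp(y * math.log(u) - u - math.lgamma(y + 1))

def direct_backorders(R, u, nmax=400):
    # Eq. (3.4): E(B_L) = (1/2) sum_{Y>=R} (Y-R)(Y-R+1)/(Y+1) p(Y)
    return 0.5 * sum((y - R) * (y - R + 1) / (y + 1) * poisson_pmf(y, u)
                     for y in range(R, nmax))

def steady_state_backorders(R, u, nmax=400):
    # Eq. (3.3): E(SSB_cy) = (1/u) sum_{Y>=R+1} (Y-R-1) P(Y),
    # where P(Y) = sum_{k>=Y} p(k) is the right tail of the demand pmf
    pmf = [poisson_pmf(k, u) for k in range(nmax)]
    ccdf = [0.0] * (nmax + 1)
    for k in range(nmax - 1, -1, -1):   # accumulate the tail backwards
        ccdf[k] = ccdf[k + 1] + pmf[k]
    return sum((y - R - 1) * ccdf[y] for y in range(R + 1, nmax)) / u

# The two formulae agree to machine precision for Poisson demand.
for R, u in [(0, 1.0), (5, 8.0), (20, 15.0)]:
    assert abs(direct_backorders(R, u) - steady_state_backorders(R, u)) < 1e-9
```

For R = 0 both expressions reduce to u/2, since every unit demanded is backordered and waits, on average, half the period.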
E(SSB_cy) = (1/u) Σ_{Y=R+1}^{∞} (Y - R - 1) P(Y)
          = (1/u) Σ_{Y=R+1}^{∞} (Y - R - 1) Σ_{k=Y}^{∞} p(k).

Interchanging the order of summation, we get

(1/u) Σ_{k=R+1}^{∞} p(k) Σ_{Y=R+1}^{k} (Y - R - 1) = (1/u) Σ_{k=R+1}^{∞} p(k) Σ_{i=0}^{k-R-1} i
                                                  = (1/u) Σ_{k=R+1}^{∞} [(k - R - 1)(k - R)/2] p(k).

Substitute Y = k - 1 to get

(1/2u) Σ_{Y=R}^{∞} (Y - R)(Y - R + 1) p(Y + 1),

then use p(Y + 1) = [u/(Y + 1)] p(Y) (a Poisson property) to obtain

(1/2) Σ_{Y=R}^{∞} [(Y - R)(Y - R + 1)/(Y + 1)] p(Y),

which is the same as Eq. (3.4).

*See, for example, Hadley and Whitin [2, Chap. 4].

5. NORMAL APPROXIMATION TO POISSON DEMAND

As total lead time demand increases, the Poisson distribution can be approximated by the Normal. Using the Normal approximation to Eq. (3.4), we have

(5.1)    E(B_L) ≈ (1/2) ∫_R^∞ [(Y - R)(Y - R + 1)/(Y + 1)] f(Y) dY,

where

f(Y) = [1/(σ√(2π))] exp[-(1/2)((Y - u)/σ)²],

u is the expected lead time demand, and σ = √u. Using the Normal approximation to Eq. (3.3), we have

(5.2)    E(SSB_cy) ≈ (1/u) ∫_R^∞ (Y - R) G(Y) dY,    where G(Y) = ∫_Y^∞ f(t) dt.

Let w = G(Y) and dv = (Y - R) dY, and use integration by parts in (5.2). This gives

E(SSB_cy) ≈ (1/2u) ∫_R^∞ (Y - R)² f(Y) dY,

which is different from Eq. (5.1). The discrepancy is due to the use of the Normal approximation: in equating Eqs. (3.3) and (3.4), the key relation p(Y + 1) = [u/(Y + 1)] p(Y) was used, which holds for the Poisson distribution but not the Normal. The two formulae are nevertheless essentially the same, and empirically were found to be almost equally effective as approximations to the solution obtained using a Poisson distribution of demand.

ACKNOWLEDGMENTS

I am indebted to Mr. Alan Kaplan and Mr. W. Karl Kruse for their contributions to this work. Appreciation is also extended to Professor Edward Silver for his helpful comments.

REFERENCES

[1] Fisz, M., Probability Theory and Mathematical Statistics (John Wiley & Sons, Inc., New York and London, 1963).
[2] Hadley, G. and Whitin, T. M., Analysis of Inventory Systems (Prentice-Hall, Inc., Englewood Cliffs, N.J., 1963).
[3] Hanssmann, F., Operations Research in Production and Inventory Control (John Wiley & Sons, Inc., New York and London, 1962).
[4] Karlin, S., A First Course in Stochastic Processes (Academic Press, New York and London, 1966).

INDEX OF VOLUME 17

AHSANULLAH, M. and A. K. MD. E. SALEH, "Optimum Allocation of Quantiles in Disjoint Intervals for the BLUEs of the Parameters of Exponential Distribution when the Sample is Censored in the Middle," Vol. 17, No. 3, Sept. 1970, pp. 331-349.
ALTER, R. and B. LIENTZ, "A Note on a Problem of Smirnov: A Graph Theoretic Interpretation," Vol. 17, No. 3, Sept. 1970, pp. 407-408.
BALAS, E., "Machine Sequencing: Disjunctive Graphs and Degree-Constrained Subgraphs," Vol. 17, No. 1, Mar. 1970, pp. 1-10.
BEGED-DOV, A. G., "Contract Award Analysis by Mathematical Programming," Vol. 17, No. 3, Sept. 1970, pp. 297-307.
BELL, C. E., "Multiple Dispatches in a Poisson Process," Vol. 17, No. 1, Mar. 1970, pp. 99-102.
BELLMORE, M., G. BENNINGTON and S. LUBORE, "A Network Isolation Algorithm," Vol. 17, No. 4, Dec. 1970, pp. 461-469.
BENNINGTON, G., M. BELLMORE and S. LUBORE, "A Network Isolation Algorithm," Vol. 17, No. 4, Dec. 1970, pp. 461-469.
BENNINGTON, G. and S. LUBORE, "Resource Allocation for Transportation," Vol. 17, No. 4, Dec. 1970, pp. 471-484.
BHASHYAM, N., "Stochastic Duels with Lethal Dose," Vol. 17, No. 3, Sept. 1970, pp. 397-405.
BHASHYAM, N., "Stochastic Duels with Nonrepairable Weapons," Vol. 17, No. 1, Mar. 1970, pp. 121-129.
BLUMENTHAL, S., "Interval Estimation of the Normal Mean Subject to Restrictions, When the Variance Is Known," Vol. 17, No. 4, Dec. 1970, pp. 485-505.
BOWMAN, V. J., JR. and G. L. NEMHAUSER, "A Finiteness Proof for Modified Dantzig Cuts in Integer Programming," Vol. 17, No. 3, Sept. 1970, pp. 309-313.
BRANDT, E. B. and D. R. LIMAYE, "MAD: Mathematical Analysis of Downtime," Vol. 17, No. 4, Dec. 1970, pp. 525-534.
BROWN, R. G. and G. GERSON, "Decision Rules for Equal Shortage Policies," Vol. 17, No. 3, Sept. 1970, pp. 351-358.
BURT, J. M., JR., D. P. GAVER, and M. PERLAS, "Simple Stochastic Networks: Some Problems and Procedures," Vol. 17, No. 4, Dec. 1970, pp. 439-459.
COZZOLINO, J. M., "The Optimal Burn-In Testing of Repairable Equipment," Vol. 17, No. 2, June 1970, pp. 167-181.
CREMEANS, J. E., R. A. SMITH, and G. R. TYNDALL, "Optimal Multicommodity Network Flows with Resource Allocation," Vol. 17, No. 3, Sept. 1970, pp. 269-279.
DAY, J. E. and M. P. HOTTENSTEIN, "Review of Sequencing Research," Vol. 17, No. 1, Mar. 1970, pp. 11-39.
DISNEY, R. L. and W. E. MITCHELL, "A Solution for Queues with Instantaneous Jockeying and Other Customer Selection Rules," Vol. 17, No. 3, Sept. 1970, pp. 315-325.
DUDEWICZ, E. J., "Confidence Intervals for Ranked Means," Vol. 17, No. 1, Mar. 1970, pp. 69-78.
EVANS, J. P., "On Constraint Qualifications in Nonlinear Programming," Vol. 17, No. 3, Sept. 1970, pp. 281-286.
GAVER, D. P., J. M. BURT, JR., and M. PERLAS, "Simple Stochastic Networks: Some Problems and Procedures," Vol. 17, No. 4, Dec. 1970, pp. 439-459.
GERSON, G. and R. G. BROWN, "Decision Rules for Equal Shortage Policies," Vol. 17, No. 3, Sept. 1970, pp. 351-358.
GOODMAN, I. F., "Statistical Quality Control of Information," Vol. 17, No. 3, Sept. 1970, pp. 389-396.
HABER, S. E. and R. SITGREAVES, "A Methodology for Estimating Expected Usage of Repair Parts with Application to Parts with No Usage History," Vol. 17, No. 4, Dec. 1970, pp. 535-546.
HARRIS, M. Y., "A Mutual Primal-Dual Linear Programming Algorithm," Vol. 17, No. 2, June 1970, pp. 199-206.
HARTMAN, J. K. and L. S. LASDON, "A Generalized Upper Bounding Method for Doubly Coupled Linear Programs," Vol. 17, No. 4, Dec. 1970, pp. 411-429.
HITCHCOCK, D. F. and J. B. MACQUEEN, "On Computing the Expected Discounted Return in a Markov Chain," Vol. 17, No. 2, June 1970, pp. 237-241.
HO, C., "A Note on the Calculation of Expected Time-Weighted Backorders Over a Given Interval," Vol. 17, No. 4, Dec. 1970, pp. 555-559.
HOTTENSTEIN, M. P. and J. E. DAY, "Review of Sequencing Research," Vol. 17, No. 1, Mar. 1970, pp. 11-39.
JACQUETTE, S. C., "Suboptimal Ordering Policies Under the Full Cost Criterion," Vol. 17, No. 1, Mar. 1970, pp. 131-132.
KALMAN, P. J., "A Stochastic Constrained Optimal Replacement Model," Vol. 17, No. 4, Dec. 1970, pp. 547-553.
KAPLAN, A. J., "The Relationship Between Decision Variables and Penalty Cost Parameter in (Q, R) Inventory Models," Vol. 17, No. 2, June 1970, pp. 253-258.
LASDON, L. S. and J. K. HARTMAN, "A Generalized Upper Bounding Method for Doubly Coupled Linear Programs," Vol. 17, No. 4, Dec. 1970, pp. 411-429.
LIENTZ, B. and R. ALTER, "A Note on a Problem of Smirnov: A Graph Theoretic Interpretation," Vol. 17, No. 3, Sept. 1970, pp. 407-408.
LIMAYE, D. R. and E. B. BRANDT, "MAD: Mathematical Analysis of Downtime," Vol. 17, No. 4, Dec. 1970, pp. 525-534.
LUBORE, S., M. BELLMORE and G. BENNINGTON, "A Network Isolation Algorithm," Vol. 17, No. 4, Dec. 1970, pp. 461-469.
LUBORE, S. and G. BENNINGTON, "Resource Allocation for Transportation," Vol. 17, No. 4, Dec. 1970, pp. 471-484.
MCMASTERS, A. W. and T. M. MUSTIN, "Optimal Interdiction of a Supply Network," Vol. 17, No. 3, Sept. 1970, pp. 261-268.
MACQUEEN, J. B. and D. F. HITCHCOCK, "On Computing the Expected Discounted Return in a Markov Chain," Vol. 17, No. 2, June 1970, pp. 237-241.
MALIK, H. J., "The Distribution of the Product of Two Non-Central Beta Variates," Vol. 17, No. 3, Sept. 1970, pp. 327-330.
MANN, N. R., "Computer-Aided Selection of Prior Distributions for Generating Monte Carlo Confidence Bounds on System Reliability," Vol. 17, No. 1, Mar. 1970, pp. 41-54.
MARKLAND, R. E., "A Comparative Study of Demand Forecasting Techniques for Military Helicopter Spare Parts," Vol. 17, No. 1, Mar. 1970, pp. 103-119.
MAZUMDAR, M., "Some Estimates of Reliability Using Interference Theory," Vol. 17, No. 2, June 1970, pp. 159-165.
MITCHELL, W. E. and R. L. DISNEY, "A Solution for Queues with Instantaneous Jockeying and Other Customer Selection Rules," Vol. 17, No. 3, Sept. 1970, pp. 315-325.
MOGLEWER, S. and C. PAYNE, "A Game Theory Approach to Logistics Allocation," Vol. 17, No. 1, Mar. 1970, pp. 87-97.
MOREY, R. C., "Inventory Systems with Imperfect Demand Information," Vol. 17, No. 3, Sept. 1970, pp. 287-295.
MUSTIN, T. M. and A. W. MCMASTERS, "Optimal Interdiction of a Supply Network," Vol. 17, No. 3, Sept. 1970, pp. 261-268.
NEMHAUSER, G. L. and V. J. BOWMAN, JR., "A Finiteness Proof for Modified Dantzig Cuts in Integer Programming," Vol. 17, No. 3, Sept. 1970, pp. 309-313.
NIGHTENGALE, M. E., "The Value Statement," Vol. 17, No. 4, Dec. 1970, pp. 507-514.
NOLAN, R. L., "Systems Analysis and Planning-Programming-Budgeting Systems (PPBS) for Defense Decision Making," Vol. 17, No. 3, Sept. 1970, pp. 359-372.
PAYNE, C. and S. MOGLEWER, "A Game Theory Approach to Logistics Allocation," Vol. 17, No. 1, Mar. 1970, pp. 87-97.
PERLAS, M., J. M. BURT, JR., and D. P. GAVER, "Simple Stochastic Networks: Some Problems and Procedures," Vol. 17, No. 4, Dec. 1970, pp. 439-459.
PRESUTTI, V. J., JR., and R. C. TREPP, "More Ado About Economic Order Quantities (EOQ)," Vol. 17, No. 2, June 1970, pp. 243-251.
ROLFE, A. J., "Markov Chain Analysis of a Situation Where Cannibalization is the Only Repair Activity," Vol. 17, No. 2, June 1970, pp. 151-158.
SALEH, A. K. MD. E. and M. AHSANULLAH, "Optimum Allocation of Quantiles in Disjoint Intervals for the BLUEs of the Parameters of Exponential Distribution when the Sample is Censored in the Middle," Vol. 17, No. 3, Sept. 1970, pp. 331-349.
SCHAFER, R. E. and N. D. SINGPURWALLA, "A Sequential Bayes Procedure for Reliability Demonstration," Vol. 17, No. 1, Mar. 1970, pp. 55-67.
SCHRADY, D. A., "Operational Definitions of Inventory Record Accuracy," Vol. 17, No. 1, Mar. 1970, pp. 133-142.
SCOTT, M., "A Queueing Process with Varying Degree of Service," Vol. 17, No. 4, Dec. 1970, pp. 515-523.
SINGPURWALLA, N. D. and R. E. SCHAFER, "A Sequential Bayes Procedure for Reliability Demonstration," Vol. 17, No. 1, Mar. 1970, pp. 55-67.
SITGREAVES, R. and S. E. HABER, "A Methodology for Estimating Expected Usage of Repair Parts with Application to Parts with No Usage History," Vol. 17, No. 4, Dec. 1970, pp. 535-546.
SMITH, R. A., J. E. CREMEANS, and G. R. TYNDALL, "Optimal Multicommodity Network Flows with Resource Allocation," Vol. 17, No. 3, Sept. 1970, pp. 269-279.
SPIVEY, W. A. and H. TAMURA, "Goal Programming in Econometrics," Vol. 17, No. 2, June 1970, pp. 183-192.
STEINBERG, D. I., "The Fixed Charge Problem," Vol. 17, No. 2, June 1970, pp. 217-235.
STERNLIGHT, D., "The Fast Deployment Logistic Ship Project: Economic Design and Decision Techniques," Vol. 17, No. 3, Sept. 1970, pp. 373-387.
TAMURA, H. and W. A. SPIVEY, "Goal Programming in Econometrics," Vol. 17, No. 2, June 1970, pp. 183-192.
TREPP, R. C. and V. J. PRESUTTI, JR., "More Ado About Economic Order Quantities (EOQ)," Vol. 17, No. 2, June 1970, pp. 243-251.
TYNDALL, G. R., J. E. CREMEANS, and R. A. SMITH, "Optimal Multicommodity Network Flows with Resource Allocation," Vol. 17, No. 3, Sept. 1970, pp. 269-279.
VON LANZENAUER, C. H., "Production and Employment Scheduling in Multistage Production Systems," Vol. 17, No. 2, June 1970, pp. 193-198.
WOLLMER, R. D., "Interception in a Network," Vol. 17, No. 2, June 1970, pp. 207-216.
YASUDA, Y., "A Note on the Core of a Cooperative Game without Side Payment," Vol. 17, No. 1, Mar. 1970, pp. 143-149.
ZACKS, S., "A Two-Echelon Multi-Station Inventory Model for Navy Applications," Vol. 17, No. 1, Mar. 1970, pp. 79-85.
ZWART, P. B., "Nonlinear Programming—The Choice of Direction by Gradient Projection," Vol. 17, No. 4, Dec. 1970, pp. 431-438.
U.S. GOVERNMENT PRINTING OFFICE: 1971 433-702/3

INFORMATION FOR CONTRIBUTORS

The NAVAL RESEARCH LOGISTICS QUARTERLY is devoted to the dissemination of scientific information in logistics and will publish research and expository papers, including those in certain areas of mathematics, statistics, and economics, relevant to the over-all effort to improve the efficiency and effectiveness of logistics operations.

Manuscripts and other items for publication should be sent to The Managing Editor, NAVAL RESEARCH LOGISTICS QUARTERLY, Office of Naval Research, Arlington, Va. 22217. Each manuscript which is considered to be suitable material for the QUARTERLY is sent to one or more referees.

Manuscripts submitted for publication should be typewritten, double-spaced, and the author should retain a copy. Refereeing may be expedited if an extra copy of the manuscript is submitted with the original. A short abstract (not over 400 words) should accompany each manuscript. This will appear at the head of the published paper in the QUARTERLY.

There is no authorization for compensation to authors for papers which have been accepted for publication. Authors will receive 250 reprints of their published papers.

Readers are invited to submit to the Managing Editor items of general interest in the field of logistics, for possible publication in the NEWS AND MEMORANDA or NOTES sections of the QUARTERLY.

NAVAL RESEARCH LOGISTICS QUARTERLY    DECEMBER 1970    VOL. 17, NO. 4    NAVSO P-1278

CONTENTS

ARTICLES    Page

A Generalized Upper Bounding Method for Doubly Coupled Linear Programs, by J. K. Hartman and L. S. Lasdon    411
Nonlinear Programming—The Choice of Direction by Gradient Projection, by P. B. Zwart    431
Simple Stochastic Networks: Some Problems and Procedures, by J. M. Burt, Jr., D. P. Gaver and M. Perlas    439
A Network Isolation Algorithm, by M. Bellmore, G. Bennington and S. Lubore    461
Resource Allocation for Transportation, by G. Bennington and S. Lubore    471
Interval Estimation of the Normal Mean Subject to Restrictions, When the Variance Is Known, by S. Blumenthal    485
The Value Statement, by M. E. Nightengale    507
A Queueing Process with Varying Degree of Service, by M. Scott    515
MAD: Mathematical Analysis of Downtime, by E. B. Brandt and D. R. Limaye    525
A Methodology for Estimating Expected Usage of Repair Parts with Application to Parts with No Usage History, by S. E. Haber and R. Sitgreaves    535
A Stochastic Constrained Optimal Replacement Model, by P. J. Kalman    547
A Note on the Calculation of Expected Time-Weighted Backorders over a Given Interval, by C. Ho    555

OFFICE OF NAVAL RESEARCH
Arlington, Va. 22217